Neoepitope detection of disease using protein arrays

ABSTRACT

A diagnostic device for and method of detecting the presence of head and neck squamous cell carcinoma (HNSCC) in a patient including a detector device for detecting a presence of at least one marker indicative of HNSCC, the detector device including a panel of markers for HNSCC. A diagnostic device for and method of staging HNSCC in a patient including a detector device for detecting a presence of at least one marker indicative of stages of HNSCC, the detector device including a panel of markers for HNSCC. Markers for head and neck squamous cell carcinoma selected from the markers listed in Table 5. Methods of personalized immunotherapy, making a personalized anti-cancer vaccine, and predicting a clinical outcome in a HNSCC patient. A method of making a panel of HNSCC markers.

CROSS-REFERENCE TO RELATED APPLICATIONS

This Continuation-In-Part Application claims priority to U.S. Continuation-In-Part patent application Ser. No. 11/060,867, filed Feb. 17, 2005, issued as U.S. Pat. No. 7,964,536 and U.S. Continuation-In-Part patent application Ser. No. 10/004,587, filed Dec. 4, 2001, issued as U.S. Pat. No. 7,863,004, which is incorporated herein by reference.

GRANT INFORMATION

Research in this application was supported in part by a grant from the National Institutes of Health (NIH Grant No. IR21CA100740-01) and a grant from MEDC (Grant No. MLSC 558). The Government has certain rights in the invention.

STATEMENT REGARDING SEQUENCE LISTING

The sequence listing associated with this application is provided in electronic text format and is hereby incorporated by reference into the specification. The name of the text file containing the sequence listing is: 0788 5 AMEND SqncListAsEfiled111610.txt. The file is 208 KB and was created on Nov. 16, 2010.

BACKGROUND OF THE INVENTION

1. Technical Field

The present invention relates to an assay and method for diagnosing disease. More specifically, the present invention relates to an immunoassay for use in diagnosing cancer.

2. Background Art

It is commonly known in the art that genetic mutations can be used for detecting cancer. For example, the tumorigenic process leading to colorectal carcinoma formation involves multiple genetic alterations (Fearon, et al. (1990) Cell 61, 759-767). Tumor suppressor genes such as p53, DCC and APC are frequently inactivated in colorectal carcinomas, typically by a combination of genetic deletion of one allele and point mutation of the second allele (Baker, et al. (1989) Science 244, 217-221; Fearon, et al. (1990) Science 247, 49-56; Nishisho, et al. (1991) Science 253, 665-669; and Groden, et al. (1991) Cell 66, 589-600). Mutation of two mismatch repair genes that regulate genetic stability was associated with a form of familial colon cancer (Fishel, et al. (1993) Cell 75, 1027-1038; Leach, et al. (1993) Cell 75, 1215-1225; Papadopoulos, et al. (1994) Science 263, 1625-1629; and Bronner, et al. (1994) Nature 368, 258-261). Proto-oncogenes such as myc and ras are altered in colorectal carcinomas, with c-myc RNA being overexpressed in as many as 65% of carcinomas (Erisman, et al. (1985) Mol. Cell. Biol. 5, 1969-1976), and ras activation by point mutation occurring in as many as 50% of carcinomas (Bos, et al. (1987) Nature 327, 293-297; and Forrester, et al. (1987) Nature 327, 298-303). Other proto-oncogenes, such as myb and neu, are activated with a much lower frequency (Alitalo, et al. (1984) Proc. Natl. Acad. Sci. USA 81, 4534-4538; and D'Emilia, et al. (1989) Oncogene 4, 1233-1239). No common series of genetic alterations is found in all colorectal tumors, suggesting that a variety of such combinations can generate these tumors.

Increased tyrosine phosphorylation is a common element in signaling pathways that control cell proliferation. The deregulation of protein tyrosine kinases (PTKs) through overexpression or mutation has been recognized as an important step in cell transformation and tumorigenesis, and many oncogenes encode PTKs (Hunter (1989) in Oncogenes and the Molecular Origins of Cancer, ed. Weinberg (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.), pp. 147-173). Numerous studies have addressed the involvement of PTKs in human tumorigenesis. Activated PTKs associated with colorectal carcinoma include c-neu (amplification), trk (rearrangement), and c-src and c-yes (mechanism unknown) (D'Emilia, et al. (1989), ibid; Martin-Zanca, et al. (1986) Nature 3, 743-748; Bolen, et al. (1987) Proc. Natl. Acad. Sci. USA 84, 2251-2255; Cartwright, et al. (1989) J. Clin. Invest. 83, 2025-2033; Cartwright, et al. (1990) Proc. Natl. Acad. Sci. USA 87, 558-562; Talamonti, et al. (1993) J. Clin. Invest. 91, 53-60; and Park, et al. (1993) Oncogene 8, 2627-2635).

Mutations, such as those disclosed above, can be useful in detecting cancer. However, there have been few advancements which can repeatably be used in diagnosing cancer prior to the existence of a tumor. Approximately 40,500 new cases of head and neck squamous cell carcinoma (HNSCC) will be diagnosed in the United States and 11,170 Americans will die from this disease in the year 2006. Worldwide, HNSCC is the sixth most common malignancy with incidence of 644,000 new cases a year. Despite progress in diagnostic and treatment modalities in the past 30 years, long-term survival for patients affected by HNSCC has not significantly improved. One major impediment to improving survival in this patient population is the failure to detect this cancer at an early stage. More than two-thirds of patients with HNSCC are diagnosed at an advanced stage when the five-year survival is less than 40%. In many cases, these patients are offered radical treatments that often result in significant physical disfigurement as well as dysfunction of speech, breathing, and swallowing. The plight of these patients with advanced stage disease is in distinct contrast to that of patients who are diagnosed early. Early stage HNSCC patients have an excellent five-year survival rate of more than 80% and experience significantly less impact on their quality of life after treatment with single modality therapy. This dramatic difference in survival and quality of life underscores the importance of early detection in this disease.

Early detection can be achieved by screening patients at high risk for development of cancer. Although the American Cancer Society has issued guidelines for screening of breast, colon, prostate, and uterine cancers, no such guideline exists for HNSCC. This is especially unfortunate given that patients at increased risk for development of HNSCC can be easily identified (history of excess alcohol and/or tobacco use) and targeted for screening. Early detection can also be improved by reducing diagnostic delays, reported to be between 3 and 5.6 months. Misdiagnosis at initial presentation to primary care physicians is common (44% to 63%) and may be due to the nonspecific nature of presenting symptoms (sore throat, hoarseness, ear pain, etc.) as well as the technical difficulty of examination in the head and neck region. The delay in diagnosis and referral to a specialist has a significant negative impact on patient outcome and survival. Thus, there exists a need for a simple, noninvasive, and inexpensive test, widely accessible to physicians in the primary care setting, which can be used to screen (in asymptomatic patients) and diagnose (in symptomatic patients) HNSCC in high-risk populations to improve early detection.

Methods for detecting and measuring cancer markers have been recently revolutionized by the development of immunological assays, particularly by assays that utilize monoclonal antibody technology. Previously, many cancer markers could only be detected or measured using conventional biochemical assay methods, which generally require large test samples and are therefore unsuitable in most clinical applications. In contrast, modern immunoassay techniques can detect and measure cancer markers in much smaller samples, particularly when monoclonal antibodies that specifically recognize a targeted marker protein are used. Accordingly, it is now routine to assay for the presence or absence, level, or activity of selected cancer markers by immunohistochemically staining tissue specimens obtained via conventional biopsy methods. Because of the highly sensitive nature of immunohistochemical staining, these methods have also been successfully employed to detect and measure cancer markers in smaller, needle biopsy specimens which require less invasive sample gathering procedures compared to conventional biopsy specimens. In addition, other immunological methods have been developed and are now well known in the art that allow for detection and measurement of cancer markers in non-cellular samples such as serum and other biological fluids from patients. The use of these alternative sample sources substantially reduces the morbidity and costs of assays compared to procedures employing conventional biopsy samples, which allows for application of cancer marker assays in early screening and low risk monitoring programs where invasive biopsy procedures are not indicated.

For the purpose of cancer evaluation, the use of conventional or needle biopsy samples for cancer marker assays is often undesirable, because a primary goal of such assays is to detect the cancer before it progresses to a palpable or detectable tumor stage. Prior to this stage, biopsies are generally contraindicated, making early screening and low risk monitoring procedures employing such samples untenable. Therefore, there is a general need in the art to obtain samples for cancer marker assays by less invasive means than biopsy, for example, by serum withdrawal.

Efforts to utilize serum samples for cancer marker assays have met with limited success, largely because the targeted markers are either not detectable in serum, or because telltale changes in the levels or activity of the markers cannot be monitored in serum. In addition, the presence of cancer markers in serum probably occurs at the time of micro-metastasis, making serum assays less useful for detecting pre-metastatic disease. Serological analysis of recombinant cDNA expression libraries (SEREX) of tumors with autologous serum is a well-established technique which has been used successfully to identify many relevant tumor antigens. However, many of the SEREX antigens identified from a specific cancer patient reacted only with autologous serum antibodies from that particular patient and tend to recognize antibodies only at a low frequency in sera from other cancer patients. Thus, clinical tests based on these SEREX antigens that are patient-specific rather than cancer-specific are insufficient for early detection of cancer.

In view of the above, an important need exists in the art for more widely applicable, non-invasive methods and materials to obtain biological samples for use in evaluating, diagnosing and managing breast and other diseases including cancer, particularly for screening early stage, nonpalpable tumors. A related need exists for methods and materials that utilize such readily obtained biological samples to evaluate, diagnose and manage disease, particularly by detecting or measuring selected cancer markers, or panels of cancer markers, to provide highly specific, cancer prognostic and/or treatment-related information, and to diagnose and manage pre-cancerous conditions, cancer susceptibility, bacterial and other infections, and other diseases.

Autoantibodies against cancer-specific antigens have been identified in cancers of the colon, breast, kidney, lung, ovary, and head and neck. Immune response with antibody production may be elicited due to the over-expression of cellular proteins such as Her2, the expression of mutated forms of cellular protein such as mutated p53, or the aberrant expression of tissue-restricted gene products such as cancer-testis antigens by cancer cells. Because these autoantibodies are raised against specific antigens from the cancer cells, the detection of these antibodies in a patient's serum can be exploited as diagnostic biomarkers of cancer in that particular patient. Further, the immune system is especially well adapted for the early detection of cancer since it can respond to even low levels of an antigen by mounting a very specific and sensitive antibody response. Thus, the use of immune response as a biosensor for early detection of cancer through serum-based assay holds great potential as an ideal screening and diagnostic tool.

With specific regard to such assays, specific antibodies can only be measured by detecting binding to their antigen or a mimic thereof. Although certain classes of immunoglobulins containing the antibodies of interest can, in some cases, be separated from the sample prior to the assay (Decker, et al., EP 0,168,689 A2), in all assays, at least some portion of the sample immunoglobulins are contacted with antigen. For example, in assays for specific IgM, a portion of the total IgM can be adsorbed to a surface and the sample removed prior to detection of the specific IgM by contacting with antigen. Binding is then measured by detection of the bound antibody, detection of the bound antigen or detection of the free antigen.

For detection of bound antibody, a labeled anti-human immunoglobulin or labeled antigen is normally allowed to bind antibodies that have been specifically adsorbed from the sample onto a surface coated with the antigen, Bolz, et al., U.S. Pat. No. 4,020,151. Excess reagent is washed away and the label that remains bound to the surface is detected. This is the procedure in the most frequently used assays, for example, for hepatitis and human immunodeficiency virus and for numerous immunohistochemical tests, Nakamura, et al., Arch Pathol Lab Med 1122:869-877 (1988). Although this method is relatively sensitive, it is subject to interference from non-specific binding to the surface by non-specific immunoglobulins that cannot be differentiated from the specific immunoglobulins.

Another method of detecting bound antibodies involves combining the sample and a competing labeled antibody, with a support-bound antigen, Schuurs, et al., U.S. Pat. No. 3,654,090. This method has its limitations because antibodies in sera bind numerous epitopes, making competition inefficient.

For detection of bound antigen, the antigen can be used in excess of the maximum amount of antibody that is present in the sample or in an amount that is less than the amount of antibody. For example, radioimmunoprecipitation (“RIP”) assays for GAD autoantibodies have been developed and are currently in use, Atkinson, et al., Lancet 335:1357-1360 (1990). However, attempts to convert this assay to an enzyme linked immunosorbent assay (“ELISA”) format have not been successful. The RIP assay is based on precipitation of immunoglobulins in human sera, and led to the development of a radioimmunoassay (“RIA”) for GAD autoantibodies. In both the RIP and the RIA, the antigen is added in excess and the bound antigen:antibody complex is precipitated with protein A-Sepharose. The complex is then washed or further separated by electrophoresis and the antigen in the complex is detected.

Other precipitating agents can be used such as rheumatoid factor or C1q, Masson, et al., U.S. Pat. No. 4,062,935; polyethylene glycol, Soeldner, et al., U.S. Pat. No. 4,855,242; and protein A, Ito, et al., EP 0,410,893 A2. The precipitated antigen can be measured to indicate the amount of antibody in the sample; the amount of antigen remaining in solution can be measured; or both the precipitated antigen and the soluble antigen can be measured to correct for any labeled antigen that is non-specifically precipitated. These methods, while quite sensitive, are all difficult to carry out because of the need for rigorous separation of the free antigen from the bound complex, which requires at a minimum filtration or centrifugation and multiple washing of the precipitate.

Alternatively, detection of the bound antigen can be employed when the amount of antigen is less than the maximum amount of antibody. Normally, that is carried out using particles such as latex particles or erythrocytes that are coated with the antigen, Cambiaso, et al., U.S. Pat. No. 4,184,849 and Uchida, et al., EP 0,070,527 A1. Antibodies can specifically agglutinate these particles and can then be detected by light scattering or other methods. It is necessary in these assays to use a precise amount of antigen, as too little antigen provides an assay response that is biphasic, so that high antibody titers can be read as negative, while too much antigen adversely affects the sensitivity. It is therefore necessary to carry out sequential dilutions of the sample to assure that positive samples are not missed. Further, these assays tend to detect only antibodies with relatively high affinities, and the sensitivity of the method is compromised by the tendency for all of the binding sites of each antibody to bind to the antigen on the particle to which it first binds, leaving no sites for binding to the other particle.

For assays in which the free antigen is detected, the antigen can also be added in excess or in a limited amount although only the former has been reported. Assays of this type have been described where an excess of antigen is added to the sample, the immunoglobulins are precipitated, and the antigen remaining in the solution is measured, Masson, et al., supra and Soeldner, et al., supra. These assays are relatively insensitive because only a small percentage change in the amount of free antigen occurs with low amounts of antibody, and this small percentage is difficult to measure accurately.
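The insensitivity noted above follows from simple arithmetic, which the minimal sketch below illustrates. The function name and the numeric amounts are hypothetical and are not taken from the cited references; the sketch assumes only that free antigen equals the total antigen added minus the antigen bound by antibody.

```python
def free_antigen_percent_change(total_antigen, bound_antigen):
    """Percent drop in free antigen after antibody binds part of it.

    Hypothetical helper: both amounts are in the same arbitrary units.
    """
    if total_antigen <= 0:
        raise ValueError("total_antigen must be positive")
    if not 0 <= bound_antigen <= total_antigen:
        raise ValueError("bound_antigen must lie between 0 and total_antigen")
    # Free antigen falls from total_antigen to (total_antigen - bound_antigen),
    # so the relative change is bound_antigen / total_antigen.
    return 100.0 * bound_antigen / total_antigen


# Illustrative numbers only: a large antigen excess and little antibody.
low_titer_change = free_antigen_percent_change(total_antigen=100, bound_antigen=2)
high_titer_change = free_antigen_percent_change(total_antigen=100, bound_antigen=50)
```

With these illustrative amounts, a low-titer sample shifts the free antigen signal by only a few percent, which is the kind of difference that is hard to resolve against ordinary measurement error.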

Practical assays in which the free antigen is detected and the antigen is not present in excess of the maximum amount of antibody expected in a sample have not been described. However, in van Erp, et al., Journal of Immunoassay 12(3):425-443 (1991), a fixed concentration of monoclonal antibody was incubated with a concentration dilution series of antigen, and free antigen was then measured using a gold sol particle agglutination immunoassay to determine antibody affinity constants.

There has been much research in the area of evaluating useful markers for determining the risk factor for patients developing insulin-dependent diabetes mellitus (IDDM). These include insulin autoantibodies, Soeldner, et al., supra and circulating autoantibodies to glutamic acid decarboxylase (“GAD”), Atkinson, et al., PCT/US89/05570 and Tobin, et al., PCT/US91/06872. In addition, Rabin, et al., U.S. Pat. No. 5,200,318 describes numerous assay formats for the detection of GAD and pancreatic islet cell antigen autoantibodies. GAD autoantibodies are of particular diagnostic importance because they occur in preclinical stages of the disease, which can make therapeutic intervention possible. However, the use of GAD autoantibodies as a diagnostic marker has been impeded by the lack of a convenient, nonisotopic assay.

One assay method involves incubating a support-bound antigen with the sample, then adding a labeled anti-human immunoglobulin. This is the basis for numerous commercially available assay kits for antibodies such as the Syn ELISA kit which assays for autoantibodies to GAD65, and is described in product literature entitled “Syn^(ELISA) GAD II-Antibodies” (Elias USA, Inc.). Substantial dilution of the sample is required because the method is subject to high background signals from adsorption of non-specific human immunoglobulins to the support.

Many of the assays described above involve detection of antibody that becomes bound to an immobilized antigen. This can have an adverse effect on the sensitivity of the assay due to difficulty in distinguishing between specific immunoglobulins and other immunoglobulins in the sample, which bind non-specifically to the immobilized antigen. There is not only a need to develop an assay that avoids non-specific detection of immunoglobulins, but there is also the need for an improved method of detecting antibodies that combines the sensitivity advantage of immunoprecipitation assays with a simplified protocol. Finally, assays that can help evaluate the risk of developing diseases are medically and economically very important. The present invention addresses these needs.

SUMMARY OF THE INVENTION

The present invention provides for a diagnostic device for use in detecting the presence of HNSCC in a patient including a detector device for detecting a presence of at least one marker indicative of HNSCC, the detector device including a panel of markers for HNSCC.

The present invention also provides for a diagnostic device for use in staging HNSCC in a patient including a detector device for detecting a presence of at least one marker indicative of stages of HNSCC, the detector device including a panel of markers for HNSCC.

Markers for HNSCC are provided selected from the markers listed in Table 5.

A method of diagnosing HNSCC is provided, including the steps of detecting markers in the serum of a patient indicative of the presence of HNSCC, and diagnosing the patient with HNSCC.

A method of staging HNSCC is provided, including the steps of detecting markers in the serum of a patient indicative of a stage of HNSCC, and determining the stage of HNSCC. A method of staging cancer is also similarly provided.

Further provided is a method of personalized immunotherapy, including the steps of detecting markers in the serum of a patient with the above diagnostic device, analyzing reactivity of markers in the serum to markers in the panel, identifying markers in the serum with the highest reactivity, and using the markers identified as immunotherapeutic agents personalized to the immunoprofile of the patient. A method of personalized targeted therapy is similarly provided.

A method of making a personalized anti-cancer vaccine is provided, including the steps of detecting markers in the serum of a patient with the above diagnostic device, analyzing reactivity of markers in the serum to markers in the panel, identifying markers in the serum with the highest reactivity, and formulating an anti-cancer vaccine using the identified markers.

Also provided is a method of predicting a clinical outcome in a HNSCC patient, including the steps of analyzing a pattern of reactivity of a patient's serum with a panel of HNSCC markers, and predicting a clinical outcome.

A method of making a panel of head and neck squamous cell carcinoma markers is provided, including the steps of creating HNSCC cDNA libraries, replicating the HNSCC cDNA libraries, performing differential biopanning, selecting clones to be arrayed on a protein microarray, immunoreacting the microarray against HNSCC patient serum, and selecting highly reactive clones for placement on a panel.

DESCRIPTION OF THE DRAWINGS

Other advantages of the present invention are readily appreciated as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings wherein:

FIG. 1 shows the matrix of reactivity between sets of clones coming from patients 1-12 (in rows) and sera from the same patients (in columns). At this point (step 2 of Procedure 2), the matrix contains the results of the self-reactions: patients 1-10 have a specific self-reaction whereas patients 11 and 12 do not, and patients 11 and 12 are eliminated from the clone selection procedure;

FIG. 2 shows a matrix of reactivity between sources of clones and different sera ordered by reactivity; the clones from patient 2 react with sera from self (column 2) and patients 4 and 8; the clones from patient 3 react with sera from self (column 3) and patients 6 and 10, etc. Note that the union of the set of clones coming from patients 2, 3, 5, 7 and 1 ensures that the chip made with these clones reacts with all patients;

FIG. 3 is a schema showing the process of the present invention;

FIGS. 4A and 4B are photographs showing phage clones spotted in replication of six in an ordered array onto nitrocellulose coated glass slides;

FIGS. 5A and 5B are photographs showing protein microarrays immunoreacted against a serum sample from an HNSCC patient (FIG. 5A) and serum from a control patient (FIG. 5B);

FIG. 6 is a schema showing the process of calculating the real accuracy of the process of the present invention; and

FIGS. 7A and 7B are graphs showing the link between the immunoreaction level of the different clones and the class membership of the samples (cancer versus healthy).
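The clone-source selection described for FIGS. 1 and 2 (discard sources with no self-reaction, then keep a set of sources whose clones together react with every serum) can be illustrated with the minimal sketch below. It is an assumption-laden illustration rather than the procedure of the invention: the greedy covering heuristic, the function names, and the 4-by-4 toy matrix are all hypothetical, and this section does not state how the union of sources shown in FIG. 2 was actually chosen.

```python
def self_reactive(matrix):
    """Indices of clone sources whose clones react with their own serum.

    matrix[i][j] is truthy when clones from patient i react with serum j.
    """
    return [i for i, row in enumerate(matrix) if row[i]]


def greedy_cover(matrix, candidates):
    """Greedily choose sources until every coverable serum is reacted with.

    Assumed heuristic (not stated in the source): at each step, take the
    candidate that reacts with the most sera not yet covered.
    """
    if not candidates:
        return []
    n_sera = len(matrix[0])
    # Only sera that at least one candidate reacts with can ever be covered.
    uncovered = {j for j in range(n_sera)
                 if any(matrix[i][j] for i in candidates)}
    chosen = []
    while uncovered:
        best = max(candidates,
                   key=lambda i: sum(1 for j in uncovered if matrix[i][j]))
        chosen.append(best)
        uncovered -= {j for j in uncovered if matrix[best][j]}
    return chosen


# Hypothetical toy data: rows are clone sources, columns are sera.
reactivity = [
    [1, 1, 0, 0],  # source 0 reacts with sera 0 and 1
    [0, 1, 1, 0],  # source 1 reacts with sera 1 and 2
    [0, 0, 1, 1],  # source 2 reacts with sera 2 and 3
    [0, 0, 0, 0],  # source 3 has no self-reaction and is eliminated
]
kept = self_reactive(reactivity)
panel_sources = greedy_cover(reactivity, kept)
```

In this toy case, sources 0 and 2 together react with all four sera, so source 1 is not needed. A greedy choice does not guarantee the smallest possible set in general; it is used here only because it is simple to follow.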

DETAILED DESCRIPTION OF THE INVENTION

Generally, the present invention provides a method and markers for use in detecting disease and stages of disease. In other words, the markers can be used to determine the presence of disease without requiring the presence of symptoms. Particularly, the present invention provides a method and markers for use in detecting HNSCC and stages of HNSCC, as well as various other diagnostic methods, which are further described below.

The present invention can further be understood in light of the following terms and definitions.

By “bodily fluid” as used herein it is meant any bodily fluid known to those of skill in the art to contain antibodies therein. Examples include, but are not limited to, blood, saliva, tears, spinal fluid, serum, and other fluids known to those of skill in the art to contain antibodies.

By “biopanning”, it is meant a selection process for use in screening a library (Parmley and Smith, Gene, 73:308 (1988); Noren, C. J., NEB Transcript, 8(1); 1 (1996)). Biopanning is carried out by incubating phages encoding the peptides with a plate coated with the proteins, washing away the unbound phage, eluting, and amplifying the specifically bound phage. Those skilled in the art readily recognize other immobilization schemes that can provide equivalent technology, such as but not limited to binding the proteins or other targets to beads.

By “staging” the disease, as for example in cancer, it is intended to include determining the extent of a cancer, especially whether the disease has spread from the original site to other parts of the body. The stages can range from 0 to 5 with 0 being the presence of cancerous cells and 5 being the spread of the cancer cells to other parts of the body including the lymph nodes. Further, the staging can indicate the stage of a borderline histology. A borderline histology is a less malignant form of disease. Additionally, staging can indicate a relapse of disease, in other words the recurrence of disease.

The term “marker” as used herein is intended to include, but is not limited to, a gene or a piece of a gene which codes for a protein, a protein such as a fusion protein, open reading frames such as ESTs, epitopes, mimotopes, antigens, and any other indicator of immune response. Each of these terms is used interchangeably to refer to a marker. The marker can also be used as a predictor of disease or the recurrence of disease.

“Mimotope” refers to a random peptide epitope that mimics a natural antigenic epitope during epitope presentation, which is further included in the invention. Such mimotopes are useful in the applications and methods discussed below. Also included in the present invention is a method of identifying a random peptide epitope. In the method, a library of random peptide epitopes is generated or selected. The library is contacted with an anti-antibody. Mimotopes are identified that are specifically immunoreactive with the antibody. Sera (containing anti-antibodies) or antibodies generated by the methods of the present invention can be used. Random peptide libraries can, for example, be displayed on phage (phagotopes) or generated as combinatorial libraries.

“Antibody” refers to a polypeptide comprising a framework region from an immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, epsilon, and mu constant region genes, as well as the various immunoglobulin diversity/joining/variable region genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, IgM, IgA, IgD and IgE, respectively.

An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair having one “light” (about 25 kDa) and one “heavy” chain (about 50-70 kDa). The N-terminus of each chain defines a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition. The terms variable light chain (V_(L)) and variable heavy chain (V_(H)) refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized fragments produced by digestion with various peptidases. Thus, for example, pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab)′₂, a dimer of Fab which itself is a light chain joined to V_(H)-C_(H)1 by a disulfide bond. The F(ab)′₂ can be reduced under mild conditions to break the disulfide linkage in the hinge region, thereby converting the F(ab)′₂ dimer into an Fab′ monomer. The Fab′ monomer is essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed. 1993)). While various antibody fragments are defined in terms of the digestion of an intact antibody, one of skill can appreciate that such fragments can be synthesized de novo either chemically or by using recombinant DNA methodology. Thus, the term antibody, as used herein, also includes antibody fragments either produced by the modification of whole antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those identified using phage display libraries (see, e.g., McCafferty, et al., Nature 348:552-554 (1990)).

For preparation of monoclonal or polyclonal antibodies, any technique known in the art can be used (see, e.g., Kohler & Milstein, Nature 256:495-497 (1975); Kozbor, et al., Immunology Today 4: 72 (1983); Cole, et al., pp. 77-96 in Monoclonal Antibodies and Cancer Therapy (1985)). Techniques for the production of single chain antibodies (U.S. Pat. No. 4,946,778) can be adapted to produce antibodies to polypeptides of this invention. Also, transgenic mice, or other organisms such as other mammals, can be used to express humanized antibodies. Alternatively, phage display technology can be used to identify antibodies and heteromeric Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty, et al., Nature 348:552-554 (1990); Marks, et al., Biotechnology 10:779-783 (1992)).

A “chimeric antibody” is an antibody molecule in which (a) the constant region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site (variable region) is linked to a constant region of a different or altered class, effector function and/or species, or an entirely different molecule which confers new properties to the chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable region, or a portion thereof, is altered, replaced or exchanged with a variable region having a different or altered antigen specificity.

The term “immunoassay” is an assay wherein an antibody specifically binds to an antigen. The immunoassay is characterized by the use of specific binding properties of a particular antibody to isolate, target, and/or quantify the antigen. In addition, an antigen can be used to capture or specifically bind an antibody.

The phrase “specifically (or selectively) binds” to an antibody or “specifically (or selectively) immunoreactive with,” when referring to a protein or peptide, refers to a binding reaction that is determinative of the presence of the protein in a heterogeneous population of proteins and other biologics. Thus, under designated immunoassay conditions, the specified antibodies bind to a particular protein at least two times the background and do not substantially bind in a significant amount to other proteins present in the sample. Specific binding to an antibody under such conditions can require an antibody that is selected for its specificity for a particular protein. For example, polyclonal antibodies raised to modified β-tubulin from specific species such as rat, mouse, or human can be selected to obtain only those polyclonal antibodies that are specifically immunoreactive, e.g., with β-tubulin modified at cysteine 239 and not with other proteins. This selection can be achieved by subtracting out antibodies that cross-react with other molecules. Monoclonal antibodies raised against modified β-tubulin can also be used. A variety of immunoassay formats can be used to select antibodies specifically immunoreactive with a particular protein. For example, solid-phase ELISA immunoassays are routinely used to select antibodies specifically immunoreactive with a protein (see, e.g., Harlow & Lane, Antibodies, A Laboratory Manual (1988), for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity). Typically a specific or selective reaction can be at least twice background signal or noise and more typically more than 10 to 100 times background.
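The “at least two times the background” criterion above can be expressed as a simple signal-to-background ratio test. The following is a minimal sketch only: the function name, the default threshold parameter, and the example readings are hypothetical, and how signal and background are measured in any given assay format is not specified here.

```python
def is_specific_binding(signal, background, min_ratio=2.0):
    """Return True when signal is at least min_ratio times background.

    Hypothetical helper: signal and background are readings in the same
    units (for example, spot intensity); min_ratio=2.0 mirrors the
    "at least two times the background" wording of the definition.
    """
    if background <= 0:
        raise ValueError("background must be positive")
    return signal / background >= min_ratio


# Illustrative readings only.
weak_spot = is_specific_binding(signal=150.0, background=100.0)
clear_spot = is_specific_binding(signal=250.0, background=100.0)
strong_spot = is_specific_binding(signal=5000.0, background=100.0, min_ratio=10.0)
```

Raising `min_ratio` toward the 10 to 100 times background range mentioned above trades sensitivity for fewer non-specific calls.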

A “label” or a “detectable moiety” is a composition detectable byspectroscopic, photochemical, biochemical, immunochemical, or chemicalmeans. For example, useful labels include ³²P, fluorescent dyes, iodine,electron-dense reagents, enzymes (e.g., as commonly used in an ELISA),biotin, digoxigenin, or haptens and proteins for which antisera ormonoclonal antibodies are available, e.g., by incorporating a radiolabelinto the peptide, or any other label known to those of skill in the art.

A “labeled antibody or probe” is one that is bound, either covalently,through a linker or a chemical bond, or noncovalently, through ionic,van der Waals, electrostatic, or hydrogen bonds to a label such that thepresence of the antibody or probe can be detected by detecting thepresence of the label bound to the antibody or probe.

The terms “isolated,” “purified,” or “biologically pure” refer to material that is substantially or essentially free from components that normally accompany it as found in its native state. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A protein that is the predominant species present in a preparation is substantially purified. The term “purified” denotes that a nucleic acid or protein gives rise to essentially one band in an electrophoretic gel. Particularly, it means that the nucleic acid or protein is at least 85% pure, optionally at least 95% pure, and optionally at least 99% pure.

The term “recombinant” when used with reference, e.g., to a cell, ornucleic acid, protein, or vector, indicates that the cell, nucleic acid,protein or vector, has been modified by the introduction of aheterologous nucleic acid or protein or the alteration of a nativenucleic acid or protein, or that the cell is derived from a cell somodified. Thus, for example, recombinant cells express genes that arenot found within the native (non-recombinant) form of the cell orexpress native genes that are otherwise abnormally expressed, underexpressed or not expressed at all.

An “expression vector” is a nucleic acid construct, generatedrecombinantly or synthetically, with a series of specified nucleic acidelements that permit transcription of a particular nucleic acid in ahost cell. The expression vector can be part of a plasmid, virus, ornucleic acid fragment. Typically, the expression vector includes anucleic acid to be transcribed operably linked to a promoter.

The term “support or surface,” as used herein, is intended to include, but is not limited to, a solid phase, which is a porous or non-porous water-insoluble material that can have any one of a number of shapes, such as strip, rod, or particle, including beads and the like. Suitable materials are well known in the art and are described in, for example, U.S. Pat. No. 5,185,243 to Ullman, et al., columns 10-11, U.S. Pat. No. 4,868,104 to Kurn, et al., column 6, lines 21-42, and U.S. Pat. No. 4,959,303 to Milburn, et al., column 6, lines 14-31, which are incorporated herein by reference. Binding of ligands and receptors to the support or surface can be accomplished by well-known techniques, readily available in the literature. See, for example, “Immobilized Enzymes,” Ichiro Chibata, Halsted Press, New York (1978) and Cuatrecasas, J. Biol. Chem. 245:3059 (1970). Whatever type of solid support is used, it must be treated so as to have bound to its surface either a receptor or ligand that directly or indirectly binds the antigen. Typical receptors include antibodies, intrinsic factor, specifically reactive chemical agents such as sulfhydryl groups that can react with a group on the antigen, and the like. For example, avidin or streptavidin can be covalently bound to spherical glass beads of 0.5-1.5 mm and used to capture a biotinylated antigen.

Signal producing system (“sps”) includes one or more components, atleast one component being a label, which generates a detectable signalthat relates to the amount of bound and/or unbound label, i.e. theamount of label bound or not bound to the compound being detected. Thelabel is any molecule that produces or can be induced to produce asignal, such as a fluorescer, enzyme, chemiluminescer, orphotosensitizer. Thus, the signal is detected and/or measured bydetecting enzyme activity, luminescence, or light absorbance.

Suitable labels include, by way of illustration and not limitation,enzymes such as alkaline phosphatase, glucose-6-phosphate dehydrogenase(“G6PDH”) and horseradish peroxidase; ribozyme; a substrate for areplicase such as Q-beta replicase; promoters; dyes; fluorescers such asfluorescein, isothiocyanate, rhodamine compounds, phycoerythrin,phycocyanin, allophycocyanin, o-phthaldehyde, and fluorescamine;chemiluminescers such as isoluminol; sensitizers; coenzymes; enzymesubstrates; photosensitizers; particles such as latex or carbonparticles; suspendable particles; metal sol; crystallite; liposomes;cells, etc., which can be further labeled with a dye, catalyst, or otherdetectable group. Suitable enzymes and coenzymes are disclosed in U.S.Pat. No. 4,275,149 to Litman, et al., columns 19-28, and U.S. Pat. No.4,318,980 to Boguslaski, et al., columns 10-14; suitable fluorescers andchemiluminescers are disclosed in U.S. Pat. No. 4,275,149 to Litman, etal., at columns 30 and 31; which are incorporated herein by reference.Preferably, at least one sps member is selected from the groupconsisting of fluorescers, enzymes, chemiluminescers, photosensitizers,and suspendable particles.

The label can directly produce a signal, and therefore, additionalcomponents are not required to produce a signal. Numerous organicmolecules, for example fluorescers, are able to absorb ultraviolet andvisible light, where the light absorption transfers energy to thesemolecules and elevates them to an excited energy state. This absorbedenergy is then dissipated by emission of light at a second wavelength.Other labels that directly produce a signal include radioactive isotopesand dyes.

Alternately, the label may need other components to produce a signal,and the sps can then include all the components required to produce ameasurable signal, which can include substrates, coenzymes, enhancers,additional enzymes, substances that react with enzymatic products,catalysts, activators, cofactors, inhibitors, scavengers, metal ions,specific binding substance required for binding of signal generatingsubstances, and the like. A detailed discussion of suitable signalproducing systems can be found in U.S. Pat. No. 5,185,243 to Ullman, etal., columns 11-13, which is incorporated herein by reference.

The label is bound to a specific binding pair (hereinafter “sbp”) memberwhich is the antigen, or is capable of directly or indirectly bindingthe antigen, or is a receptor for the antigen, and includes, withoutlimitation, the antigen; a ligand for a receptor bound to the antigen; areceptor for a ligand bound to the antigen; an antibody that binds theantigen; a receptor for an antibody that binds the antigen; a receptorfor a molecule conjugated to an antibody to the antigen; an antigensurrogate capable of binding a receptor for the antigen; a ligand thatbinds the antigen, etc. Binding of the label to the sbp member can beaccomplished by means of non-covalent bonding, for example, by formationof a complex of the label with an antibody to the label, or by means ofcovalent bonding, for example, by chemical reactions which result inreplacing a hydrogen atom of the label with a bond to the sbp member orcan include a linking group between the label and the sbp member. Suchmethods of conjugation are well known in the art. See for example, U.S.Pat. No. 3,817,837 to Rubenstein, et al., which is incorporated hereinby reference. Other sps members can also be bound covalently to sbpmembers. For example, in U.S. Pat. No. 3,996,345 to Ullman, et al., twosps members such as a fluorescer and quencher can be bound respectivelyto two sbp members that both bind the analyte, thus forming afluorescer-sbp₁:analyte:sbp₂-quencher complex. Formation of the complexbrings the fluorescer and quencher in close proximity, thus permittingthe quencher to interact with the fluorescer to produce a signal. Thisis a fluorescent excitation transfer immunoassay. Another concept isdescribed in, EP 0,515,194 A2 to Ullman, et al., which uses achemiluminescent compound and a photosensitizer as the sps members. Thisis referred to as a luminescent oxygen channeling immunoassay. Both theaforementioned references are incorporated herein by reference.

The method and markers of the present invention can be used to diagnosethe presence of a disease or a disease stage in a patient, as well as inother diagnostic methods as mentioned above. Specifically, a diagnosticdevice for use in detecting the presence of HNSCC in a patient isprovided and includes a detector device for detecting the presence of atleast one marker in the serum of the patient.

The detector includes, but is not limited to, an assay, a slide, afilter, a microarray, a macroarray, computer software implementing dataanalysis methods, and any combinations thereof. For example, thedetector can be an immunoassay such as ELISA. The detector can alsoinclude a two-color detection system or other detector system known tothose of skill in the art.

The detector also includes a panel of markers that are indicative of thepresence of disease. The panel of markers can include markers that areknown to those of skill in the art and markers determined utilizing themethodology disclosed herein. Examples of diseases that the markersdetect include, but are not limited to, gynecological sickness such asendometriosis, ovarian cancer, breast cancer, cervical cancer, HNSCC,and primary peritoneal carcinoma. The method can also be used toidentify overexpressed or mutated proteins in tumor cells. That suchproteins are mutated or overexpressed presumably is the basis for theimmune reaction to these proteins. Therefore, markers identified usingthese methods could provide markers for molecular pathology asdiagnostic or prognostic markers. In the present invention, the markersare preferably used to detect HNSCC. Preferably, the markers areselected from those listed in Table 5.

A method of making a panel of head and neck squamous cell carcinomamarkers is provided, including the steps of creating HNSCC cDNAlibraries, replicating the HNSCC cDNA libraries, performing differentialbiopanning, selecting clones to be arrayed on a protein microarray,immunoreacting the microarray against HNSCC patient serum, and selectinghighly reactive clones for placement on a panel. The method of makingthe detector and panel are further described in the examples below aswell.

This diagnostic device can be used to diagnose HNSCC by detectingmarkers in the serum of a patient indicative of the presence of HNSCC,and diagnosing the patient with HNSCC. As further described below, theserum of the patient is compared to the panel of markers in the detectordevice, and based on the reactivity of the serum with the panel, adiagnosis is made.

The present invention also provides for a diagnostic device for use instaging HNSCC in a patient including a detector device for detecting apresence of at least one marker indicative of stages of HNSCC, whereinthe detection device includes a panel of markers for HNSCC. The detectoris essentially the same as described above; however, the detector iscreated to determine the stage of HNSCC. Relevant markers are selectedaccording to each stage of HNSCC desired in order to make the panel ofHNSCC markers. This panel can then be used in the diagnostic device totest a patient's serum in order to determine the stage of their HNSCC.Knowing the stage of HNSCC can aid in selecting appropriate treatments.

The present invention also provides for a more general method of stagingcancer, by detecting RNA or protein levels of markers that areoverexpressed or altered due to mutation in the serum of a patientindicative of a stage of cancer with the diagnostic device, anddetermining the stage of the cancer.

A method of personalized immunotherapy is provided, including the stepsof detecting markers in the serum of a patient with the diagnosticdevice as described above, analyzing reactivity of markers in the serumto markers in the panel of the diagnostic device, identifying markers inthe serum with the highest reactivity, and using the markers identifiedas immunotherapeutic agents personalized to the immunoprofile of thepatient. Thus, the immunotherapy is targeted to a person's immunoprofilebased on the panel of markers.

According to the present invention, the immunotherapeutic agents arepreferably immunotherapeutic agents for HNSCC; however, theimmunotherapeutic agents can also be directed to other diseases such as,but not limited to, those mentioned above. For personalizedimmunotherapy, the reactivity to particular epitope clones can becorrelated using sera from patients having cancer. Using a comprehensivepanel of epitope markers that can accurately detect early stage HNSCC,one can utilize these antigens as immunotherapeutic agents personalizedto the immunoprofile of each patient. When T-cells from the patientrecognize antigen biomarkers, the T-cells are stimulated, activated andtherefore produce an immune-response. Such reactivity demonstrates thepotential of each antigen as a component of a vaccine to induce a Tcell-mediated immune response essential for generation of cancervaccines. Individuals scoring positive in the presymptomatic testing forHNSCC are then offered an anti-cancer vaccine tailored to theirimmunoprofile against a panel of tumor antigens.

Like the method of personalized immunotherapy, the present invention also provides for a method of personalized targeted therapy by detecting markers that are overexpressed or altered due to mutation in the serum of a patient with the diagnostic device as described above, analyzing reactivity of markers in the serum to markers in the panel of the diagnostic device, identifying markers in the serum with the highest reactivity, and using the markers identified as therapeutic targets personalized to the patient. In other words, this method is directed to different types of therapy and not just immunotherapy.

Treatment of a patient can be altered based upon the markers detected.For example, the treatment can be specifically designed based upon themarkers identified. In other words, the therapy can be altered to mostsuitably treat the identified markers, such that the treatment isdesigned to most efficiently treat the identified marker. Thus, thetreatment is personalized according to the patient's needs. The abilityto adjust the therapy enables the treatment to be tailored to the needsof the person being treated. The treatments that can be used range fromvaccines to chemotherapy.

The present invention therefore also provides for a method of making apersonalized anti-cancer vaccine, including the steps of detectingmarkers in the serum of a patient with the diagnostic device asdescribed above, analyzing reactivity of markers in the serum to markersin the panel of the diagnostic device, identifying markers in the serumwith the highest reactivity, and formulating an anti-cancer vaccineusing the identified markers. Preferably, the anti-cancer vaccine is ananti-HNSCC vaccine.

As further described in the examples below, a method is also provided ofpredicting a clinical outcome in a HNSCC patient, including the steps ofanalyzing a pattern of reactivity of a patient's serum with a panel ofHNSCC markers, and predicting a clinical outcome. The detector deviceincluding the panel as described above can be used to analyze thepatient's serum. Further, the clinical outcome predicted includes, butis not limited to, a response to a particular therapeutic interventionor chemotherapeutic drug, survival, and development of neck or distantmetastasis.

Explanations as well as examples are provided below detailing themethods and devices of the present invention.

The analysis of mRNA expression in tumors does not necessarily revealthe status of protein levels in the cancer cells. Other factors such asprotein half-life and mutation can be altered without an effect on mRNAlevels thus masking significant molecular changes at the protein level.Serum antibody reactivity to cellular proteins occurs in cancer patientsdue to presentation of mutated forms of proteins from the tumor cells oroverexpression of proteins in the tumor cells. The host immune systemcan direct individuals to molecular events critical to the genesis ofthe disease. Using a candidate gene approach, experience has shown thatthe frequency of serum positivity to any single protein is low.Therefore, to increase the identification of such autoantigens, a moreglobal approach is employed to exploit immunoreactivity to identifylarge numbers of cDNAs coding for proteins that are mutated orupregulated in cancer cells.

In order to develop an effective screening test for early detection ofHNSCC cancer, cDNA phage display libraries are used to isolate cDNAscoding for epitopes reacting with antibodies present specifically in thesera of patients with HNSCC cancer. The methods of the present inventiondetect various antibodies that are produced by patients in reaction toproteins overexpressed in their HNSCC tumors. This is achievable bydifferential biopanning technology using human sera collected both fromnormal individuals and patients having HNSCC cancer and phage displaylibraries expressing cDNAs of genes expressed in HNSCC tumors and celllines. Serum reactivity toward a cellular protein can occur because ofthe presentation to the immune system of a mutated form of the proteinfrom the tumor cells or overexpression of the protein in the tumorcells. The strategy provides for the identification of epitope-bearingphage clones (phagotopes) displaying reactivity with antibodies presentin sera of patients having HNSCC cancer but not in control sera fromunaffected individuals. This strategy leads to the identification ofnovel disease-related epitopes for diseases including, but not limitedto, HNSCC cancer, that have prognostic/diagnostic value with additionalpotential for therapeutic vaccines and medical imaging reagents. Thisalso creates a database that can be used to determine both the presenceof disease and the stage of the disease.

The series of experiments disclosed herein provide direct evidence thatbiopanning a T7 coat protein fusion library can isolate epitopes forantibodies present in polyclonal sera. This also showed that thetechnology can be applied to direct microarray screening of largenumbers of selected phage against numerous patient and control sera.This approach provides a large number of biomarkers for early detectionof disease.

More specifically, the methods of the present invention provide four to five cycles of affinity selection and biopanning, which are carried out with biological amplification of the phage after each biopanning, meaning growth of the biological vector of the cDNA expression clone in a biological host. Examples of biological amplification include, but are not limited to, growth of a lytic or lysogenic bacteriophage in host bacteria or transformation of a bacterial host with selected DNA of the cDNA expression vector. The number of biopanning cycles generally determines the extent of the enrichment for phage that bind to the sera of patients with HNSCC cancer. This strategy allows for one cycle of biopanning to be performed in a single day. Someone skilled in the art can establish different schedules of biopanning that provide the same essential features of the procedure described above.

Two biopanning experiments are performed with each library, differentially selecting clones between control and disease patient sera. The first selection is to isolate phagotope clones that do not bind to control sera pooled from control individuals but do bind to a pool of disease patient serum. This set of phagotope clones represents epitopes that are indicative of the presence of disease as recognized by the host immune system. The second type of screening is performed to isolate phagotope clones that do not bind to a pool of control sera but do bind to an individual patient's serum. Those sets of phagotope clones represent epitopes that are indicative of the presence of disease.

Subsequent to the biopanning, the clones so isolated can be used to contact antibodies in sera by spotting the clones or peptide sequences of amino acids containing those encoded by the clones. After spotting on a solid support, the arrays are rinsed briefly in 1% BSA/PBS to remove unbound phage, then transferred immediately to a 1% BSA/PBS blocking solution and allowed to sit for one hour at room temperature. The excess BSA is rinsed off from the slides using PBS. This step ensures that the elution step of antibodies is more effective. The use of PBS elutes all of the antibodies without harming the binding of the antibody. Antibody detection of reaction with the clones or peptides on the array is carried out by labeling of the serum antibodies or through the use of a labeled secondary antibody that reacts with the patient's antibodies. A second control reaction to every spot allows for greater accuracy of the quantitation of reactivity and increases sensitivity of detection.

The slides are subsequently processed to quantify the reaction of each phagotope. Such processing is specific to the label used. For instance, if Cy3 and Cy5 fluorophore labels are used, this processing is done in a laser scanner that captures an image of the slide for each fluorophore used. Subsequent image processing familiar to those skilled in the art can provide intensity values for each phagotope. The data analysis can be divided into the following steps:

1. Pre-processing and normalization.
2. Identifying the most informative markers.
3. Building a predictor for molecular diagnosis of HNSCC cancer and validating the results.

The purpose of the first step is to cleanse the data from artifacts andprepare it for the subsequent steps. Such artifacts are usuallyintroduced in the laboratory and include: slide contamination,differential dye incorporation, scanning and image processing problems(e.g. different average intensities from one slide to another),imperfect spots due to imperfect arraying, washing, drying, etc. Thepurpose of the second step is to select the most informative phages thatcan be used for diagnostic purposes. The purpose of the third step is touse a software classifier able to diagnose cancer based on the antibodyreactivity values of the selected phages. The last step also includesthe validation of this classifier and the assessment of its performanceusing various measures such as specificity, sensitivity, positivepredictive value and negative predictive value. The computation of suchmeasures can be done on cases not used during the design of the chip inorder to assess the real-world performance of the diagnosis toolobtained.

The pre-processing and normalization step is performed as follows for arrays using two channels, such as Cy5 for the human IgG and Cy3 for the T7 control. The spots are segmented, and the mean intensity is calculated for each spot. A mean intensity value is calculated for the background as well. A background-corrected value is calculated by subtracting the background from the signal. If necessary, non-linear dye effects can be eliminated by performing an exponential normalization (Houts, 2000) and/or LOESS normalization of the data and/or a piecewise linear normalization. The values obtained from each channel are subsequently divided by the mean of the intensities over the whole array. Subsequently, the ratio between the IgG and the T7 channels is calculated. The values coming from replicate spots (spots printed in quadruplicates) are combined by calculating the mean and standard deviation. Outliers (outside +/− two standard deviations) are flagged for manual inspection. Single-channel arrays are pre-processed in a similar way but without taking the ratios. This preprocessing sequence was shown to provide good results for all preliminary data analyzed.

The step of selecting the most informative markers is used to identify the most informative phages out of the large set of phages started with. The better the selection, the better is the expected accuracy of the diagnosis tool.

A first test is necessary to determine whether a specific epitope is suitable for inclusion in the final set to be spotted. The selection methods to be applied follow the principles of the methods successfully applied in (Golub et al., 1999; Alizadeh et al., 2000) and can be briefly described as follows.

Procedure 1

The procedure is initiated by defining a template for the cancer case. Unlike gene expression experiments, where the expression level of a gene can be either up or down in cancer vs. healthy subjects, here the presence of antibodies specific to cancer is tested for. Therefore, epitopes with high reactivity in controls and low reactivity in patients are not expected, and this single template profile is sufficient. Each epitope has a profile across the given set of patients. The profile of each epitope is compared with the template using a correlation-based distance. Those skilled in the art can recognize that other distances can be used without essentially changing the procedure.

The epitopes are then prioritized based on the similarity between the reference profile and their actual profile. For example, as detailed in U.S. Ser. No. 11/060,867 (incorporated herein by reference), 46 epitopes were found to be informative for a correlation threshold of 0.8. The final cutoff threshold is calculated by doing 1000 random permutations once the whole data set becomes available. Each such permutation randomly moves the subjects between the ‘patient’ and ‘control’ categories. Calculating the score of each epitope profile for such permutations allows for the establishment of a suitable threshold for the similarity (Golub et al., 1999).
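The template comparison and permutation step of Procedure 1 can be sketched as follows. This is a minimal illustration and not the implementation used in the experiments: the function names, the 0/1 encoding of the template, and the choice of taking a high quantile of the best permuted score as the cutoff are all assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def template_scores(profiles, is_patient):
    """Correlate each epitope profile with the template (assumed: 1 = patient, 0 = control)."""
    template = [1.0 if p else 0.0 for p in is_patient]
    return {name: pearson(values, template) for name, values in profiles.items()}


def permutation_threshold(profiles, is_patient, n_perm=1000, quantile=0.99, seed=0):
    """Assumed cutoff rule: a high quantile of the best score seen under shuffled labels."""
    rng = random.Random(seed)
    labels = list(is_patient)
    best = []
    for _ in range(n_perm):
        rng.shuffle(labels)  # randomly moves subjects between the two categories
        best.append(max(template_scores(profiles, labels).values()))
    best.sort()
    return best[min(len(best) - 1, int(quantile * len(best)))]


# Hypothetical reactivity values for two epitopes across three patients and three controls.
profiles = {"epitope_A": [5.0, 6.0, 5.5, 1.0, 1.2, 0.9],
            "epitope_B": [1.0, 5.0, 1.0, 5.0, 1.0, 5.0]}
is_patient = [True, True, True, False, False, False]
scores = template_scores(profiles, is_patient)
informative = [name for name, s in scores.items() if s >= 0.8]
```

With so few subjects the permuted cutoff is not meaningful; the permutation step is intended for the complete data set.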

The technique closely follows the one used in (Golub et al., 1999). However, the technique can be further improved as follows. First, this technique was shown to provide good results if most controls are consistent by providing the same type of reactivity. However, preliminary data showed that there are control subjects that show a non-specific reactivity with all clones, while still clearly different from patients. Such control subjects with a high non-specific reaction introduce spikes in the clone profile in the area corresponding to the control subjects. These spikes decrease the score of the relevant clones, making them more difficult to distinguish from the irrelevant ones. In order to reduce this effect, all control subjects with a non-specific response (i.e. a unimodal distribution) were eliminated from the analysis leading to the epitope selection.

A second essential modification is related to the set of epitopes selected. There are rare patients who might react only to a small number of very specific epitopes. If the selection of the epitopes is done on statistical grounds alone, such very specific epitopes can be missed if the set of patients available contains only a few such rare patients. In order to maximize the sensitivity of the penultimate test resulting from this work, every effort was made to include epitopes which might be the only ones reacting to rare patients. In order to do this, the information content of the set of epitopes is maximized while trying to minimize the number of epitopes used, using the following procedure.

Procedure 2

It is assumed that there are m patients and k controls. n random patients are selected from the m available. For each of the n patients used for epitope selection, amplification is performed (n×4 biopannings) as well as self-reactions. Those patients/epitopes that do not react to themselves are eliminated.

A chip is made with all available self-reacting epitopes printed in quadruplicates. This chip is reacted with all patients and controls (n+k antibody reactions). Controls with a non-specific reactivity are eliminated. For the set of epitopes coming from a single patient, Procedure 1 is applied to order the epitopes by their informational content, and the ones that can be used to differentiate patients from controls are selected.

The epitopes are ordered by their reactivity in decreasing order of the number of patients they react to. This list is scanned from the top down, and epitopes are moved from this list to the final set. Every time a set of epitopes from a patient x is added to the final set, the patient x and all other patients that these epitopes react to are represented in the current set of epitopes. This is repeated until all patients are represented in the current set of epitopes.

This procedure minimizes the number of epitopes used while maximizing the number of patients that react to the chip containing the selected epitopes.

The following simple example shows how this procedure works. The matrix in FIG. 1 contains a row i for the clones coming from patient i and a column j for the serum coming from patient j. A serum is said to react specifically with a set of clones if the histogram of the ratios is bimodal. A serum is said to react non-specifically if the histogram of the ratios is unimodal. Furthermore, a serum might not react at all with a set of clones. If the serum from patient j reacts specifically with the clones from patient i, the matrix contains a value of 1 at the position (i, j). The element at position (i, j) is left blank if there is no reaction or the reaction is non-specific.

Each set of epitopes corresponding to a row of the matrix is pruned by sub-selecting epitopes according to Procedure 1. The rows are now sorted in decreasing reactivity (number of patients other than self that the clones react to). For instance, in FIG. 2, the clones from patient 2 react with sera from self (column 2) and patients 4 and 8. The clones from patient 3 react with sera from self (column 3) and patients 6 and 10, etc. The final set of clones was obtained from patients 2, 3, 5, 7 and 1 (reading top-down in column 1). Clones coming from patients 8, 9 and 10 are not included since these patients already react to clones coming from other patients. This set ensures that the chip made with these clones reacts with all patients in this example.
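The selection in Procedure 2 can be sketched as a greedy covering pass over the reactivity matrix. The sketch below is illustrative only: the function name and data layout are assumptions, and the reactivity data are hypothetical values loosely patterned on the example above, not the actual contents of the figures.

```python
def select_clone_sets(reacts_with):
    """Greedy pass over clone sets, most reactive first.

    reacts_with maps a source patient to the set of patients whose sera react
    specifically with that patient's clones (self included).
    Returns the selected source patients and the set of patients they cover.
    """
    all_patients = set(reacts_with)
    for sera in reacts_with.values():
        all_patients |= set(sera)

    # Sort rows by the number of patients other than self that the clones react to.
    ordered = sorted(reacts_with,
                     key=lambda p: len(set(reacts_with[p]) - {p}),
                     reverse=True)

    selected, covered = [], set()
    for source in ordered:
        if covered >= all_patients:
            break  # every patient is already represented
        newly_covered = set(reacts_with[source]) - covered
        if newly_covered:  # skip sets whose patients are already represented
            selected.append(source)
            covered |= newly_covered
    return selected, covered


# Hypothetical reactivity data (row: source of the clones, values: reacting sera).
reacts_with = {
    1: {1},
    2: {2, 4, 8},
    3: {3, 6, 10},
    5: {5, 9},
    7: {7},
    8: {8},
    9: {9},
    10: {10},
}
selected, covered = select_clone_sets(reacts_with)
```

For these hypothetical data the clone sets from patients 8, 9 and 10 are never selected, because the sera of those patients already react with clone sets chosen earlier in the scan. Rows with equal reactivity keep their input order, so the order of tied entries in `selected` depends on how the data are listed.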

Procedure 3

Arrays using two channels such as Cy5 for the human IgG and Cy3 for the T7 control are processed as follows. The spots are segmented and the mean intensity is calculated for each spot. A mean intensity value is calculated for the background as well. A background-corrected value is calculated by subtracting the background from the signal. The values coming from each channel are normalized by dividing by their mean. Subsequently, the ratio between the IgG and the T7 channels is calculated and a logarithmic function is applied. The values coming from replicate spots (spots printed in quadruplicates) are combined by calculating the mean and standard deviation. Outliers (outside +/− two standard deviations) are flagged for manual inspection. Someone skilled in the art can recognize that various combinations and permutations of the steps above or similar could replace the normalization procedure above without substantially changing the rest of the data analysis process. Such similar steps include, without limitation, taking the median instead of the mean, using logarithmic functions in various bases, etc.
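The normalization steps of Procedure 3 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the base-2 logarithm, and the optional `ref_sd` parameter are hypothetical choices for the example, and spot segmentation is assumed to have already produced per-spot signal and background intensities.

```python
import math
import statistics


def normalize_channel(signal, background):
    """Subtract the background per spot, then divide by the channel mean."""
    corrected = [s - b for s, b in zip(signal, background)]
    channel_mean = statistics.fmean(corrected)
    return [c / channel_mean for c in corrected]


def log_ratios(igg, igg_bg, t7, t7_bg, base=2.0):
    """Log of the normalized IgG/T7 ratio per spot (NaN where it cannot be computed)."""
    igg_n = normalize_channel(igg, igg_bg)
    t7_n = normalize_channel(t7, t7_bg)
    return [math.log(a / b, base) if a > 0 and b > 0 else math.nan
            for a, b in zip(igg_n, t7_n)]


def combine_replicates(values, ref_sd=None, n_sd=2.0):
    """Return (mean, sd, indices of replicates farther than n_sd spreads from the mean).

    ref_sd is a hypothetical hook for an externally estimated spread; when it is
    None, the standard deviation of the replicates themselves is used.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values) if len(values) > 1 else 0.0
    spread = sd if ref_sd is None else ref_sd
    flagged = [i for i, v in enumerate(values) if abs(v - mean) > n_sd * spread > 0]
    return mean, sd, flagged


# Hypothetical intensities for four spots in each channel.
ratios = log_ratios(igg=[110, 210, 410, 810], igg_bg=[10, 10, 10, 10],
                    t7=[210, 210, 210, 210], t7_bg=[10, 10, 10, 10])
```

One point to keep in mind when applying the outlier rule: if the standard deviation is computed from the same four replicates, no replicate can lie more than 1.5 sample standard deviations from their mean, so a two-standard-deviation rule would need a spread estimated from a larger set of spots. The `ref_sd` argument is one possible way to supply it.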

The histogram of the average log ratio is calculated. If the histogram is unimodal, there is no specific response. If the histogram is clearly bimodal, there is a specific response. All subjects analyzed fell in one of these two categories or had no response at all. A mixed probability model is used in less clear cases to fit two normal distributions as in (Lee, 2000). If the two distributions found under the maximum likelihood assumption are separated by a distance d of more than 2 standard deviations (corresponding to a p-value of approximately 0.05), there is a specific response. If the distance is less than 2 standard deviations, the response can be considered as not specific. The preliminary data analyzed so far showed a very good separation of the distributions for the patients.
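The two-distribution fit can be sketched with a small expectation-maximization (EM) routine, which is a common way to obtain maximum likelihood estimates for a mixture of two normal distributions. This is an illustrative sketch, not the method of (Lee, 2000): the initialization, the variance floor, and in particular the choice to measure the separation in units of the larger component standard deviation are assumptions, since the text does not state which standard deviation is meant.

```python
import math
import statistics


def fit_two_gaussians(x, n_iter=200):
    """EM fit of a two-component 1-D normal mixture; returns (weights, means, sds)."""
    xs = sorted(x)
    if len(xs) < 4:
        raise ValueError("need at least four values")
    half = len(xs) // 2
    mu = [statistics.fmean(xs[:half]), statistics.fmean(xs[half:])]
    spread = statistics.pstdev(xs) or 1.0
    sd = [spread, spread]
    w = [0.5, 0.5]
    floor = 1e-3 * spread  # keeps a component from collapsing onto a single point
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in xs:
            p = [w[k] * math.exp(-0.5 * ((v - mu[k]) / sd[k]) ** 2) / sd[k]
                 for k in (0, 1)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and standard deviation of each component.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            w[k] = nk / len(xs)
            mu[k] = sum(r[k] * v for r, v in zip(resp, xs)) / nk
            var = sum(r[k] * (v - mu[k]) ** 2 for r, v in zip(resp, xs)) / nk
            sd[k] = max(math.sqrt(var), floor)
    return w, mu, sd


def is_specific(x, min_sep=2.0):
    """Assumed rule: component means more than min_sep (larger) sds apart."""
    _, mu, sd = fit_two_gaussians(x)
    return abs(mu[1] - mu[0]) / max(sd) > min_sep


# Hypothetical average log ratios with two clearly separated groups of clones.
log_ratio_values = [-0.1, 0.0, 0.1, -0.05, 0.05, 0.02, 2.9, 3.0, 3.1, 3.05]
weights, means, sds = fit_two_gaussians(log_ratio_values)
```

A library implementation of Gaussian mixtures could replace the hand-written EM loop; it is written out here only to keep the sketch self-contained.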

Once the chosen clones are spotted on the final version of the array, a number of sera coming from both patients and controls can be tested. These sera come from subjects not used in any of the phases that led to the fabrication of the array (i.e., not involved in clone selection, not used as controls, etc.). Each test is evaluated using Procedure 3 above. The performance on this validation data can be reported in terms of positive predictive value (PPV), negative predictive value (NPV), specificity and sensitivity. Since these performance indicators are calculated on data not previously used, they provide a good indication of the performance of the test for screening purposes for the various categories of patients envisaged in the general population.
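For reference, the four performance indicators follow directly from the counts of true and false positives and negatives on the validation sera. The Python sketch below uses standard definitions; the function name and the example counts are hypothetical and are not results from this study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard screening indicators from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # fraction of patients detected
        "specificity": tn / (tn + fp),  # fraction of controls cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Invented counts: 10 patients and 10 controls.
metrics = screening_metrics(tp=8, fp=1, tn=9, fn=2)
```

Unlike sensitivity and specificity, PPV and NPV depend on the proportion of patients in the tested group, so values obtained on a validation set transfer to a screening population only if disease prevalence is similar.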

The present invention also provides a kit including all of the technology for performing the above analysis. This is included in a container of a size sufficient to hold all of the required pieces for analyzing sera, as well as a digital medium such as a floppy disk or CD-ROM containing the software necessary to interpret the results of the analysis. These components include the array of clones or peptides spotted onto a solid support, prewashing buffers, a detection reagent for identifying reactivity of the patients' serum antibodies to the spotted clones or peptides, post-reaction washing buffers, primary and secondary antibodies to quantify reactivity of the patients' serum antibodies with the spotted array, and methods to analyze the reactivity so as to establish an interpretation of the serum reactivity.

A biochip, otherwise known as a biosensor, for detecting the presence of the disease state in a patient's sera is provided by the present invention. The biochip has a detector contained within the biochip for detecting antibodies in a patient's sera. This allows a patient's sera to be tested for the presence of a multitude of diseases or reaction to disease markers using a single sample, and the analysis can be conducted and analyzed on a single chip. By utilizing such a chip, the time required for the detection of disease is lowered while also enabling a doctor to determine the level of disease spread or infection. The chip, or other informatics system, can be altered to weigh the results. In other words, the informatics can be altered to adjust the levels of sensitivity and/or specificity of the chip.

The above discussion provides a factual basis for the use of the combination of markers and method of making the combination. The methods used with and the utility of the present invention can be shown by the following non-limiting examples and accompanying figures.

Example 1

Combining phage display technology with protein microarray technology, 5,133 selectively cloned tumor antigens were screened and ranked using a feature selection method based on receiver-operator characteristic curves for neural network classifiers. A model was built using the top-ranked 40 biomarkers. The entire modeling strategy, both feature selection and model development, was validated by bootstrapping on an independent set of 80 cancer and 78 control sera. Estimated accuracy of this modeling strategy was 82.9% (95% CI 77.2-87.9) with sensitivity of 83.2% (95% CI 74.0-91.6) and specificity of 82.7% (95% CI 74.5-93.6). The accuracy of this novel diagnostic platform represents a significant improvement over current diagnostic accuracy of 37% to 56% in the primary care setting. Further, the diagnostic test described can be easily translated into assays such as the enzyme-linked immunosorbent assay, which is already widely available in clinical practice and familiar to clinical laboratories. This facilitates wide-spread application of this assay as a simple tool to screen (in asymptomatic patients) and diagnose (in symptomatic patients) HNSCC in the high-risk population.

Methods

HNSCC and control patients were recruited from the otolaryngology clinic population.

Construction of T7 phage display cDNA libraries. HNSCC specimens were obtained at the time of surgical extirpation and poly-A RNA extracted and purified. The construction of T7 phage cDNA display libraries was performed using Novagen's OrientExpress cDNA Synthesis (Random primer system) and Cloning System as per manufacturer's suggestions.

Differential biopanning of HNSCC phage display cDNA libraries. Differential biopannings using sera from control and HNSCC patients were performed as per the manufacturer's protocol (T7Select System, TB178).

Protein microarray immunoreaction. Individual clones were picked and arrayed in replicates of 6 onto FAST™ slides (Schleicher & Schuell) using a robotic microarrayer Prosys 5510TL (Cartesian Technologies).

Microarray data analysis. Following immunoreaction, the microarrays were scanned with an Axon Laboratories 4100A scanner (Axon Laboratories) using 635 nm and 532 nm lasers to produce a red (AlexaFluor647) and green (AlexaFluor532) composite image. Using the ImaGene™ 6.0 (Biodiscovery) image analysis software, the binding of each of the cancer-specific peptides with IgGs in each serum was then analyzed and expressed as a ratio of red to green fluorescent intensities. The microarray data were further processed by a sequence of transformations including background correction, omission of poor quality spots, log-transformation, normalization by subtracting the global median (in log scale), and then combining of 6 spot replicates to yield a mean value for each marker.

Sequencing of phage cDNA clones. Individual phage clones were PCR amplified using forward primer 5′GTTCTATCCGCAACGTTATGG3′ (SEQ ID NO: 1) and reverse primer 5′GGAGGAAAGTCGTTTTTTGGGG3′ (SEQ ID NO: 2) and sequenced using the forward primer by the Wayne State University Sequencing Core Facility.

Serum samples. Blood samples from HNSCC patients (Stages I-IV) and control patients were obtained after informed consent. Both HNSCC and control patients were recruited from the otolaryngology clinic population. All enrolled HNSCC patients have cancer confirmed on pathology. Control patients underwent thorough head and neck examination and imaging to rule out the presence of cancer after they initially presented with nonspecific head and neck symptoms such as sore throat, hoarseness, dysphagia, coughing, choking and gasping, neck mass, and otalgia. Blood samples were centrifuged at 2,500 rpm at 4° C. for 15 minutes and supernatants were stored at −70° C.

Construction of T7 phage display cDNA libraries. Head and neck cancer specimens were obtained at the time of surgical extirpation and immediately placed in RNAlater solution (Ambion). Total RNA extraction was performed using TRIZOL reagent (Invitrogen Corporation) per the manufacturer's protocol. After extraction, poly-A RNAs were purified twice using Straight A mRNA Isolation System (EMD Biosciences) per protocols from the manufacturer. The construction of T7 phage cDNA display libraries was performed using Novagen's OrientExpress cDNA Synthesis (Random primer system) and Cloning System as per the manufacturer's suggestions (Novagen).

Differential biopanning of HNSCC phage display cDNA libraries. Differential biopannings using sera from normal healthy patients and HNSCC patients were performed as per manufacturer's protocol (Novagen: T7Select System, TB178). Protein G Plus-agarose beads were used for immobilization of serum immunoglobulins (IgGs). Three to five rounds of biopanning were performed using serum from each of the 12 HNSCC patients. Each cycle of biopanning consisted of passing the entire phage library through protein-G beads coated with IgGs from pooled sera of healthy controls, passage through beads coated with IgGs from individual serum from HNSCC patients, followed by final elution of bound phage clones from the beads.

Protein microarray immunoreaction. Individual clones were picked and arrayed in replicates of six onto FAST™ slides (Schleicher & Schuell) using a robotic microarrayer Prosys 5510TL (Cartesian Technologies) with 32 Micro-Spotting Pins (TeleChem). Protein microarrays were blocked with 4% milk in 1×PBS for one hour at room temperature followed by another hour of incubation with primary antibodies consisting of human serum at a dilution of 1:300 in PBS, mouse anti-T7 capsid antibodies (0.15 μg/ml) (EMD Biosciences, Madison, Wis.), and BL21 E. coli cell lysates (5 μg/ml). The microarrays were then washed three times in PBS/0.1% Tween-20 solution for four minutes each at room temperature and then incubated with AlexaFluor647 (red fluorescent dye)-labeled goat anti-human IgG antibodies (1 μg/ml) and AlexaFluor532 (green fluorescent dye)-labeled goat anti-mouse IgG antibodies (0.05 μg/ml) (Molecular Probes, Eugene, Oreg.) for one hour in the dark. Finally, the microarrays were washed three times in PBS/0.1% Tween-20 for four minutes each, then twice in PBS for two minutes each, and air dried.

Results

Differential biopanning results in enrichment of T7 phage HNSCC cDNA display libraries. T7 phage cDNA display libraries were constructed from three HNSCC specimens (floor of mouth, base of tongue, and larynx). The insertion of foreign HNSCC cDNAs into the T7 phage capsid genes results in the production of fusion capsid proteins. Foreign peptides displayed in this fashion have been shown to fold in their native conformations, thus exposing both linear and conformational antigens on the surface of the bacteriophage where they are accessible for selection and analysis.

A potential limitation of the T7 phage display system, however, is the absence of post-translational modifications such as glycosylation, sulfation, and phosphorylation, which can influence the folding and binding of these peptides. Each of these three cDNA phage libraries was found to contain between 10⁶ and 10⁷ primary recombinants. Since the majority of the clones in the HNSCC cDNA libraries carried normal self-proteins, differential biopanning was performed in order to enrich the cDNA libraries with clones expressing the HNSCC-specific peptides (FIG. 3). The technique relied on specific antigen-antibody reactions to remove clones binding to immunoglobulins present in control sera from non-cancer patients while retaining clones with peptides of interest (HNSCC-specific antigens) binding to antibodies in HNSCC sera that serve as the bait. In order to increase the diversity of HNSCC-specific peptides, the cDNA libraries were biopanned against sera from 12 HNSCC patients with tumors representing different subsites of head and neck (Table 1a and 1b).

FIG. 3 depicts a schema showing the process of combining phage-display technology, protein microarrays, and bioinformatics tools to profile and select a panel of 40 unique clones from an initial 10⁷ clones in the three HNSCC cDNA phage display libraries. Three cDNA libraries were constructed from HNSCC specimens. Because the majority of the clones in the HNSCC cDNA libraries carried normal self-proteins, subtractive biopanning was performed in order to enrich the cDNA libraries with clones expressing the HNSCC-specific peptides. The technique relied on specific antigen-antibody reactions to remove clones binding to immunoglobulins (IgGs) from control sera while retaining clones with peptides of interest (HNSCC-specific antigens) using antibodies in HNSCC sera as bait. Protein G Plus-agarose beads were used for immobilization of serum IgGs. Three to five rounds of biopanning were performed using serum from each of the 12 HNSCC patients. Each cycle of biopanning consisted of passing the entire phage library through protein-G beads coated with IgGs from pooled sera of healthy controls, passage through beads coated with IgGs from individual serum from HNSCC patients, followed by final elution of bound phage clones from the column. Following biopanning, a total of 5,133 clones were randomly chosen from the 12 highly enriched pools of T7 phage cDNA libraries. These clones were arrayed and immunoreacted against serum samples from 39 HNSCC and 41 control patients. The binding of arrayed HNSCC-specific peptides with antibodies in sera was quantified with the AlexaFluor647 (red-fluorescent dye)-labeled anti-human antibody. Any small variation in the amount of phage particles spotted throughout the microarray was quantified by measuring the amount of mouse anti-T7 capsid antibodies bound using AlexaFluor532 (green fluorescent dye)-labeled goat anti-mouse IgG antibody. Following immunoreaction, the microarray data was analyzed and processed by a sequence of transformations. To reduce the number of clones for further analysis, a one-tailed t-test was used to select 1,021 clones (from the original 5,133 clones) with increased reactivity to cancer sera compared to control sera (p<0.1). Sera from 80 cancer and 78 non-cancer controls, not previously used for biopanning or selection of clones, were immunoreacted against the previously selected 1,021 HNSCC-specific peptides. The reactivity of each of the 1,021 cancer-specific peptides with each of these 158 sera was then analyzed. Following data normalization and transformation, the top 100 clones based on a one-tailed t-test were selected. Subsequently these 100 clones were re-ranked based on their performance in a neural network model. The performance measure used was the area under the curve (AUC) taken as an average over 200 bootstrap trials. These clones were then sequenced and analyzed for homology to mRNA and genomic entries in the GenBank databases using BLASTn. A classifier, based on a three-layer feed-forward neural network, was then built using the top unique biomarkers. Several models were created using increasing numbers of features from the top ranked clones. An input set size of 40 clones was found to be a good compromise between the performance and complexity of the classifier.

High throughput protein microarray immunoscreening for selection of informative HNSCC-specific biomarkers. A total of 5,133 clones were randomly chosen from the 12 highly enriched pools of T7 phage clones from these cDNA libraries. These clones were arrayed and immunoreacted against serum samples from 39 HNSCC patients (Table 2a) and 41 controls (Table 2b). The binding of arrayed HNSCC-specific peptides with antibodies in sera was quantified with the AlexaFluor647 (red-fluorescent dye)-labeled anti-human antibody. The amount of phage particles at each spot on the microarray was quantified using a mouse monoclonal antibody to the T7 capsid protein, detected with AlexaFluor532 (green fluorescent dye)-labeled goat anti-mouse IgG antibody. The ratio of AlexaFluor647 intensity over AlexaFluor532 intensity was then calculated in order to account for any small variation in the serum antibody binding to antigens due to different amounts of phage particles spotted on the microarray (FIG. 4).

FIG. 4 shows phage clones that were spotted in replicates of six in an ordered array onto FAST™ nitrocellulose coated glass slides. The reactivity of each clone with each serum sample was analyzed and expressed as a ratio of red fluorescent intensity to green fluorescent intensity. The binding of arrayed HNSCC-specific peptides with antibodies in sera was detected with the AlexaFluor647 (red-fluorescent dye)-labeled anti-human antibody. The use of mouse anti-T7 capsid antibodies, detected with the use of AlexaFluor532 (green fluorescent dye)-labeled goat anti-mouse IgG antibody, was necessary in order to normalize for any small variation in the amount of phage particles spotted throughout the microarray chip.

Following immunoreaction, the microarray data was processed by a sequence of transformations and then analyzed. To reduce the number of clones for further analysis, one-tailed t-tests under the R environment (v2.3.0) were used to select clones with increased binding to immunoglobulins present in cancer sera compared to control sera using the criterion of p<0.10; 1,021 clones met the criterion. In general, the visually positive clones (yellow or orange spots) that reacted with cancer sera but not control sera (FIG. 5) corresponded to those for which the t-test was statistically significant.
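The per-clone comparison can be sketched as follows. This Python sketch is an illustration only and does not reproduce the R analysis: it assumes an equal-variance (pooled) two-sample t statistic, which the text does not specify, and it stops at the statistic because converting it to a p-value requires the Student t distribution, which is not in the Python standard library. For fixed group sizes a larger t corresponds to a smaller one-tailed p-value, so ordering clones by t gives the same order as ordering by significance. Function names, clone names and values are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t(cancer, control):
    """Equal-variance two-sample t statistic; positive if cancer > control."""
    n1, n2 = len(cancer), len(control)
    pooled_var = (
        (n1 - 1) * variance(cancer) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(cancer) - mean(control)) / math.sqrt(
        pooled_var * (1.0 / n1 + 1.0 / n2)
    )


def rank_by_t(clones):
    """Order clone names from strongest to weakest cancer-over-control signal."""
    scores = {name: pooled_t(ca, co) for name, (ca, co) in clones.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Invented log ratios: (cancer sera, control sera) for three made-up clones.
clones = {
    "clone_a": ([2.0, 2.2, 1.8, 2.1], [0.1, -0.1, 0.0, 0.2]),
    "clone_b": ([0.1, 0.0, -0.1, 0.2], [0.0, 0.1, -0.2, 0.1]),
    "clone_c": ([-1.0, -0.8, -1.2, -0.9], [0.0, 0.1, -0.1, 0.2]),
}
ranking = rank_by_t(clones)
```

A clone such as the hypothetical clone_c, which reacts less with cancer sera than with control sera, receives a negative statistic and would never pass a one-tailed criterion in the cancer-greater direction.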

FIG. 5 shows the protein microarrays immunoreacted against a serum sample from an HNSCC patient (left panel) and serum from a control patient (right panel). The visually positive clones are represented by yellow or orange spots in replicates of six. These visually positive clones corresponded to the statistically significant clones selected based on the t-test.

Ranking of top clones based on protein microarray immunoreaction. Sera from 80 HNSCC patients and 78 controls, not previously used for biopanning or selection of clones, were immunoreacted against the previously selected 1,021 HNSCC-specific peptides. Of the 80 HNSCC patients, 18 had early stage (I and II) and 62 had advanced stage (III and IV) disease. The distribution of early and late stage disease reflects the distribution of HNSCC in clinical practice. HNSCC from almost all subsites of head and neck were represented (Table 3a). Cases and controls are similar in age, race, and gender (Table 3b). The reactivity of each of the 1,021 cancer-specific peptides with each of these 158 sera was then analyzed. Following data normalization and transformation, clones were ranked on the basis of the attained significance levels from the one-tailed t-tests and the top 100 clones were selected. Subsequently these 100 clones were re-ranked based on their performance in a neural network model using area under the ROC curve (AUC) averaged over 200 bootstrap trials. These 100 clones were then sequenced and analyzed for homology to mRNA and genomic entries in the GenBank databases using BLASTn. The predicted amino acids were also determined in-frame with the T7 gene 10 capsid protein. A list of unique clones was generated by eliminating duplicate clones as well as those clones containing truncated peptides with fewer than 5 amino acids.
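The AUC used for re-ranking can be illustrated with a short Python sketch. It makes two simplifying assumptions that are not taken from the text: the score being evaluated is passed in directly (the neural network model is not reproduced), and each bootstrap trial resamples the cancer and control groups separately with replacement. The AUC itself is computed from the standard pairwise definition, namely the fraction of cancer-control pairs in which the cancer score is higher, with ties counted as one half. Function names and scores are hypothetical.

```python
import random


def auc(cancer_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for c in cancer_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cancer_scores) * len(control_scores))


def bootstrap_mean_auc(cancer, control, n_boot=200, seed=0):
    """Average AUC over bootstrap resamples drawn within each group."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_boot):
        ca = [rng.choice(cancer) for _ in cancer]
        co = [rng.choice(control) for _ in control]
        total += auc(ca, co)
    return total / n_boot


# Invented scores.
partial = auc([3.0, 4.0, 5.0], [1.0, 2.0, 3.5])   # 8 of 9 pairs favor cancer
chance = auc([1.0, 2.0], [1.0, 2.0])              # identical groups
separated = bootstrap_mean_auc([3.0, 4.0, 5.0], [0.0, 1.0, 2.0])
```

An AUC of 0.5 corresponds to a score that carries no information about class membership, and 1.0 to a score that separates the two groups completely.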

Validation of modeling strategy. A classifier, based on a three-layer feed-forward neural network, was then built using the top unique biomarkers, using the nnet package under the R environment (v2.3.0). Several models were created using increasing numbers of features from the top ranked clones. An input set size of 40 clones was found to be a good compromise between the performance and complexity of the classifier. Special attention was paid to avoid data overfitting by using a reduced number of hidden nodes (n=4) as well as using a training method (Broyden-Fletcher-Goldfarb-Shanno) that included regularization, as implemented in the nnet package. The performance of the classifier was estimated by averaging over 100 bootstrap samples from the original data set (80 cancer and 78 control sera) to obtain an accuracy of 82.9% (95% CI 77.2-87.9) with sensitivity of 83.2% (95% CI 74.0-91.6) and specificity of 82.7% (95% CI 74.5-93.6) (FIG. 6). The classifier was able to detect early stage HNSCC at least as well as late stage cancers. In fact, the accuracy on early stage cancers (86.7%) was slightly better than the accuracy on late stage cancers (82.9%). The performance of this classifier in detecting cancer from different subsites of the head and neck region was 85% (glottis), 86.2% (supraglottis), 90% (hypopharynx), 83.5% (oropharynx), 95% (nasopharynx), and 78.6% (oral cavity).

Of the top 40 unique clones, there were five clones that contained known gene products in the reading frame of the T7 gene 10 capsid protein. These included ubiquinone-binding protein, NADH dehydrogenase subunit 1, pp21 homolog (transcription elongation factor A (SII)-like 7), multiple myeloma overexpression gene 2, and C10 protein (Table 4). The remaining 35 clones contained peptides that were different from the original proteins coded by the inserted gene fragments. This occurred because the inserted gene fragments were out of frame with the open reading frame of the T7 10B gene (n=14) (Table 5a), represented untranslated regions of known genes (n=11) (Table 5b), or contained sequences from unknown genes (n=10) (Table 5c). It is likely that the recombinant gene products of these clones mimic some other natural antigens, and hence can be termed mimotopes. A BLASTp search of the SWISSPROT database for homology to each in-frame mimotope revealed that many of these gene products mimic known cancer proteins and as such represent putative tumor antigens.

FIG. 6 shows the schema used to calculate the real accuracy from apparent accuracy and optimism. A classifier (C), based on the top 40 clones, was trained and tested using 80 cancer and 78 control samples. The resulting accuracy (96.2%) is called the apparent accuracy and is an optimistic estimate of the true accuracy. The real-world performance of the classifier methodology (feature selection and model training) was estimated by averaging over 100 bootstrap samples. Each bootstrap run involved resampling by drawing from the original dataset with replacement. For every bootstrap sample, a set of 40 features was selected which performed best on this new arrangement of the data and a classifier (C′) was trained based on that data and these features. The classifier C′ was subsequently applied to both the original data and bootstrap datasets. The difference between the accuracies on these two data sets provided a measure of the “optimism” embedded in the apparent accuracy. The expected real-world performance (82.9%) was obtained by subtracting the mean value of the optimism (13.3%) from the apparent accuracy (96.2%). The confidence intervals of the estimated performance indices were determined from the empirical distribution of the optimism value.
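The optimism correction of FIG. 6 can be sketched generically in Python. In the sketch, `fit` stands for the whole modeling strategy (feature selection plus training) and `accuracy` for its evaluation; both are passed in as functions. The one-feature threshold rule used to exercise the code is a hypothetical stand-in for the neural network, and the data are invented.

```python
import random


def optimism_corrected_accuracy(samples, fit, accuracy, n_boot=100, seed=0):
    """Return (corrected, apparent, mean optimism) for a modeling strategy."""
    rng = random.Random(seed)
    # Apparent accuracy: train and test on the same original data.
    apparent = accuracy(fit(samples), samples)
    optimism = []
    for _ in range(n_boot):
        # Resample the original data with replacement.
        boot = [rng.choice(samples) for _ in samples]
        model = fit(boot)  # feature selection and training both happen here
        # Optimism: accuracy on the bootstrap data minus on the original data.
        optimism.append(accuracy(model, boot) - accuracy(model, samples))
    mean_optimism = sum(optimism) / n_boot
    return apparent - mean_optimism, apparent, mean_optimism


def fit_threshold(samples):
    """Stand-in model: midpoint between the two class means."""
    pos = [v for v, y in samples if y == 1]
    neg = [v for v, y in samples if y == 0]
    if not pos or not neg:  # resample happened to contain a single class
        return sum(v for v, _ in samples) / len(samples)
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


def accuracy(threshold, samples):
    """Fraction of samples on the correct side of the threshold."""
    hits = sum(1 for v, y in samples if (v > threshold) == (y == 1))
    return hits / len(samples)


# Invented (value, label) pairs: label 1 = cancer, 0 = control.
samples = [(0.1 * i, 0) for i in range(10)] + [(3.0 + 0.1 * i, 1) for i in range(10)]
corrected, apparent, mean_optimism = optimism_corrected_accuracy(
    samples, fit_threshold, accuracy
)
```

Applied to the figures reported above, the same subtraction gives 96.2% − 13.3% = 82.9%.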

Evaluation of neural network classifier performance using 10-fold cross validation. Sera from 80 HNSCC patients and 78 controls, not previously used for biopanning or selection of clones, were immunoreacted against the previously selected 1,021 HNSCC-specific peptides. Of the 80 HNSCC patients, 18 had early stage (I and II) and 62 had advanced stage (III and IV) disease, reflecting the distribution of HNSCC in clinical practice. HNSCC from almost all subsites of head and neck were represented (Table 3a). In order to reflect the target screening population, control sera used were taken from patients who presented with symptoms or exams similar to those of head and neck cancer patients. Many of these control patients also have a history of moderate to excessive tobacco and/or alcohol use. Cases and controls are similar in age, race, and gender (Table 3b).

A ten-fold cross-validation procedure was used to assess the performance of the neural network model for cancer classification based on patterns of serum immunoreactivity against a panel of biomarkers. In this scheme, the data was split into 10 equal parts, balanced with respect to the cancer and control groups. Both the clone (feature) selection and model training were based on nine-tenths of the data and the model was tested on the remaining one-tenth of the dataset. Feature selection based on training data (142 samples) included several steps. First, clones that immunoreacted, on average, less with sera from cancer patients than controls were discarded. The remaining clones were then ranked using the p-value from a t-test and the top 250 clones were used individually to derive a ROC curve. Finally, the top 130 clones ranked in decreasing order of area under the ROC curve (AUC) were retained to build a classification model, based on a three-layer feed-forward neural network, using the nnet package³² under the R environment (v2.3.0). Special attention was paid to avoid data overfitting by using a reduced number of hidden nodes (n=5) as well as using a training method (Broyden-Fletcher-Goldfarb-Shanno) which included regularization, as implemented in the nnet package. The resulting classifier was then tested against the independent test set of 16 samples. The entire 10-fold cross-validation was repeated 100 times in order to minimize any potential bias due to random partition of training and test sets (FIG. 2).
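The balanced ten-way split can be sketched in Python as follows. This is a hypothetical illustration of one way to balance the folds by class and not a reconstruction of the R code that was used: the function name, the round-robin assignment, and the seed are assumptions. Feature selection and model training would be repeated inside each fold, using only the training indices.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with classes spread evenly."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    offset = 0
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices round-robin, continuing where the
        # previous class stopped so fold sizes stay as even as possible.
        for j, i in enumerate(idx):
            folds[(j + offset) % k].append(i)
        offset += len(idx)
    return folds


# 80 cancer sera (label 1) and 78 control sera (label 0), as in the text.
labels = [1] * 80 + [0] * 78
folds = stratified_folds(labels)
train_sizes = [len(labels) - len(fold) for fold in folds]
```

Because 158 samples do not divide evenly by ten, eight of the folds hold 16 test samples (leaving 142 for training, matching the sizes quoted above) and two hold 15.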

The classification method yields an average accuracy of 74.6% (95% CI, 52.5% to 96.7%) with AUC of 82.3%, sensitivity of 73.1%, and specificity of 76.1%. Notably, this classifier was able to detect early stage HNSCC (72.8%) about as well as late stage cancers (73.2%). The sensitivity of this classifier in detecting cancer from different subsites of the head and neck region was 73.9% (glottis), 72.6% (supraglottis), 83.7% (hypopharynx), 74.9% (oropharynx), 87.5% (nasopharynx), 67.3% (oral cavity), and 60% (unknown primary). To further verify that there is a true link between the immunoreaction level of the different clones and the class membership of the samples (cancer vs. healthy), the class identifiers were randomly permuted among the patients and the cross-validation procedure was repeated with the permuted data. As expected, the estimates of accuracy and AUC obtained in these permuted cases were around 50% and are statistically significantly different (p<1e−15) from the accuracy (74.6%) and AUC (82.3%) obtained using the actual class identifiers (FIG. 3).

Characterization of the 130-biomarker panel. Clones were ranked based on the number of times they were selected and used in the 130-biomarker panel out of 1,000 possible iterations (10 folds × 100 different partitions) (Table 4). The top 130 markers were sequenced and analyzed for homology to mRNA and genomic entries in the GenBank databases using BLASTn. The predicted amino acids were also determined in-frame with the phage T7 gene 10 capsid protein. Of the top 130 clones, there were 8 clones that contained known gene products in the reading frame of the T7 gene 10 capsid protein. These included multiple myeloma overexpression gene 2, ubiquinone binding protein, NADH dehydrogenase subunit 1, C10 protein, and a hypothetical protein LOC400242 (Table 5). The remaining 122 clones contained peptides that were different from the original proteins coded by the inserted gene fragments. This occurred because the inserted gene fragments were out of frame with the open reading frame of the T7 10B gene (n=61) (Table 6a), represented untranslated regions of known genes (n=18) (Table 6b), or contained sequences from unknown genes (n=43) (Table 6c). It is likely that the recombinant gene products of these clones mimic some other natural antigens, and hence can be termed mimotopes. A BLASTp search of the SWISSPROT database for homology to each in-frame mimotope revealed that many of these gene products mimic known cancer proteins and as such represent putative tumor antigens.
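The ranking by selection frequency amounts to counting how often each clone appears across the panels produced by the individual cross-validation iterations. A minimal Python sketch is given below; the function name is hypothetical, and although the clone names are borrowed from Table 4, the three short panels and the resulting counts are invented for illustration.

```python
from collections import Counter


def rank_by_selection(panels):
    """Rank clones by how many panels they appear in, most frequent first."""
    # set() guards against counting a clone twice within one panel.
    counts = Counter(clone for panel in panels for clone in set(panel))
    # Ties are broken alphabetically so the order is reproducible.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Three invented panels standing in for the 1,000 iterations.
panels = [
    ["1_H3", "2_D4"],
    ["1_H3", "5_C8"],
    ["1_H3", "2_D4"],
]
ranked = rank_by_selection(panels)
```

In the study itself each iteration contributes a panel of 130 clones, so a clone selected in every iteration reaches the maximum count of 1,000.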

Discussion

Protein microarray technology was used for high throughput quantitative analysis of the antibody-antigen reaction between 238 serum samples and 5,133 cloned cancer antigens pre-selected via biopanning. Using specialized bioinformatics, the massive dataset was mined to identify a panel of top 130 cancer biomarkers (FIG. 1). Many of these markers represent or mimic known cancer antigens. Based on this panel of 130 cancer biomarkers, a trained neural network classifier has a sensitivity of 73.1% and specificity of 76.1% in discriminating cancer from noncancer serum samples in an independent test set. The performance of this classifier represents a significant improvement over current diagnostic accuracy in the primary care setting with reported sensitivity of 37% to 46% and specificity of 24%.

The results presented in this study indicate the potential of a new platform for cancer detection based on analysis of the pattern of serum immunoreactivity against a panel of cancer antigens. The pattern of immunoreactivity is highly reproducible. In addition, serum IgGs were found to be extremely stable, which should minimize interlaboratory variations in the clinical diagnostics setting. A significant advantage of the approach is the potential to translate this technology into simple assay systems, such as the enzyme-linked immunosorbent assay (ELISA), which is already widely available in clinical practice and familiar to clinical laboratories. This facilitates the wide-spread application of this assay as a simple tool in the primary care setting to screen (in asymptomatic patients) and diagnose (in symptomatic patients) HNSCC in the high-risk population.

Finally, the pattern of reactivity of biomarkers with serum samples from HNSCC patients can be analyzed to develop other classifiers capable of predicting clinical outcome and thereby guiding the selection of optimal therapeutic treatments. For example, the clinical outcomes that can be predicted include, but are not limited to, response to a particular therapeutic intervention or chemotherapeutic drug, survival, and development of neck or distant metastasis. These biomarkers also have utility in post-treatment monitoring of HNSCC patients and provide new targets for therapeutic interventions or diagnostic imaging in future clinical trials. Because the host immune system can reveal molecular events (overexpression or mutation) critical to the genesis of HNSCC, this proteomics technology can also identify genes with mechanistic involvement in the etiology of the disease.

In conclusion, a novel technology based on a combination of high throughput antigen selection using microarray-based serological profiling and specialized bioinformatics identified a panel of antigen biomarkers that can provide sufficient accuracy for a clinically relevant, serum-based cancer detection test based on the pattern of serum immunoglobulin binding.

Throughout this application, various publications, including United States patents, are referenced by author and year and patents by number. Full citations for the publications are listed below. The disclosures of these publications and patents in their entireties are hereby incorporated by reference into this application in order to more fully describe the state of the art to which this invention pertains.

The invention has been described in an illustrative manner, and it is to be understood that the terminology that has been used is intended to be in the nature of words of description rather than of limitation.

Obviously, many modifications and variations of the present invention are possible in light of the above teachings. It is, therefore, to be understood that within the scope of the appended claims, the invention can be practiced otherwise than as specifically described.

TABLE 1a Tumor subsites and stages of the 12 HNSCC sera used in the subtractive biopanning.

Subsites        Stage I   Stage II   Stage III   Stage IV   Total
Supraglottis    0         0          0           3          3
Glottis         1         0          0           1          2
Nasopharynx     0         0          0           0          0
Oropharynx      0         0          0           3          3
Hypopharynx     0         0          0           0          0
Oral Cavity     0         0          0           3          3
Unknown         0         0          0           1          1
Total           1         0          0           11         12

TABLE 1b Patient characteristics of the 12 HNSCC patients and 9 control patients whose sera were used in the subtractive biopanning.

                        HNSCC          Control
Number of patients      12             9
Mean Age (range)        61.6 (46-87)   38 (22-56)
Race
  African American      6              5
  Caucasian             6              3
  Hispanics             0              1
Gender
  Male                  11             5
  Female                1              4

TABLE 2a Tumor subsites and stages of the 39 HNSCC sera used for immunoscreening of the original 5,134 clones.

Subsites             Stage I   Stage II   Stage III   Stage IV   Total
Supraglottis         1         2          1           5          9
Glottis              4         0          2           1          7
Nasopharynx          1         0          0           0          1
Oropharynx           1         1          0           8          10
Hypopharynx          0         0          1           0          1
Oral Cavity          1         2          0           5          8
Cervical Esophagus   0         0          0           1          1
Unknown              0         0          0           2          2
Total                8         5          4           22         39

TABLE 2b Patient characteristics of the 39 HNSCC patients and 41 control patients whose sera were used for immunoscreening of the original 5,134 clones.

                        HNSCC          Control
Number of patients      39             41
Mean Age (range)        60 (26-87)     47 (20-76)
Race
  African American      22             27
  Caucasian             17             13
  Hispanics             0              1
Gender
  Male                  35             27
  Female                4              14

TABLE 3a Tumor subsites and stages of the 80 HNSCC sera used for immunoscreening of the 1,021 clones arrayed on the master protein microchips.

Subsites             Stage I   Stage II   Stage III   Stage IV   Total
Supraglottis         2         1          5           2          10
Glottis              2         1          3           8          14
Nasopharynx          1         1          0           2          4
Oropharynx           1         1          0           14         16
Hypopharynx          0         1          3           5          9
Oral Cavity          2         5          1           13         21
Cervical Esophagus   0         0          0           1          1
Unknown              0         0          0           5          5
Total                8         10         12          50         80

TABLE 3b Patient characteristics of the 80 HNSCC patients and 78 control patients whose sera were used for immunoscreening of the 1,021 clones arrayed on the final protein microchips.

                        HNSCC          Control
Number of patients      80             78
Mean Age (range)        60 (27-81)     51 (19-74)
Race
  African American      45             45
  Caucasian             35             33
Gender
  Male                  59             57
  Female                21             21

TABLE 4 Ranking of clones based on the number of times out of 1,000 possible rankings they were selected in the 130 biomarker panel used to build the neural network classifier.

Name     Rank   Number of times selected out of 1,000 possible times
1_H3     1      1000
1_H8     2      1000
10_C3    3      1000
10_F12   4      1000
10_G8    5      1000
10_H6    6      1000
2_D4     7      1000
2_D8     8      1000
2_G11    9      1000
2_G8     10     1000
2_H4     11     1000
5_C8     12     1000
6_C8     13     1000
6_D4     14     1000
6_G11    15     1000
6_G12    16     1000
7_C11    17     1000
7_C4     18     1000
7_G4     19     1000
10_B11   20     999
10_G11   21     999
11_C4    22     999
2_D3     23     999
3_D4     24     999
9_C4     25     999
1_D7     26     998
10_C9    27     998
7_C3     28     998
9_G4     29     998
9_G8     30     998
11_B12   31     997
2_C10    32     996
2_G4     33     996
9_G12    34     995
11_C8    35     994
5_G3     36     994
8_C4     37     994
10_C4    38     993
10_G5    39     993
10_G7    40     993
5_C11    41     991
1_D12    42     990
10_D6    43     985
10_D12   44     982
4_G8     45     982
1_D11    46     980
5_H8     47     979
8_G11    48     979
1_D2     49     978
6_H8     50     976
9_C12    51     976
11_B8    52     974
9_F12    53     972
10_B12   54     969
2_G7     55     965
1_G4     56     954
5_C4     57     954
4_G7     58     953
1_H2     59     952
9_G7     60     949
10_H8    61     947
11_C7    62     947
2_G3     63     946
2_D6     64     943
10_F10   65     942
10_G12   66     934
8_G3     67     933
1_H7     68     931
2_C8     69     919
12_D10   70     907
9_H4     71     904
10_H11   72     889
2_H3     73     877
2_C12    74     871
1_G3     75     868
11_B10   76     866
5_C3     77     864
2_C11    78     857
11_C9    79     847
5_H4     80     845
1_C12    81     843
10_D7    82     840
2_H7     83     830
10_E12   84     819
9_C3     85     810
11_C3    86     809
2_C2     87     809
5_C7     88     807
11_A1    89     796
10_F11   90     795
2_F3     91     787
2_H8     92     785
2_G2     93     782
10_F8    94     775
10_H3    95     774
11_C1    96     761
9_C2     97     759
2_C3     98     728
6_C12    99     728
10_B10   100    710
12_C4    101    697
2_C7     102    669
8_B12    103    659
5_D4     104    653
5_G4     105    652
8_D4     106    644
1_H4     107    634
1_D6     108    598
4_G4     109    587
10_B4    110    581
11_A2    111    571
10_E8    112    570
10_C1    113    558
8_H10    114    558
6_G8     115    551
9_C6     116    541
1_D4     117    538
6_C4     118    528
6_G3     119    498
12_C12   120    495
1_D8     121    489
5_C6     122    479
12_D2    123    453
6_F4     124    453
1_G8     125    442
8_G8     126    431
9_H7     127    424
9_D11    128    418
6_C3     129    417
10_D3    130    408
11_D4    131    406
2_F4     132    405
10_C8    133    403
4_B3     134    401
4_F8     135    400
9_B12    136    389
2_D2     137    374
11_B6    138    373
4_C8     139    360
2_C1     140    349
5_G6     141    348
11_C6    142    344
11_B2    143    326
3_C4     144    325
1_G7     145    320
11_D8    146    317
1_C8     147    312
9_G10    148    306
8_B4     149    302
8_C8     150    293
12_D6    151    292
10_F6    152    287
7_G1     153    287
7_G3     154    286
2_G10    155    284
8_B3     156    281
10_B8    157    276
3_C8     158    272
1_G2     159    270
1_D3     160    269
9_F10    161    267
8_D8     162    261
9_G11    163    251
5_C12    164    236
11_C11   165    233
4_C4     166    232
7_G2     167    223
7_H4     168    217
6_A10    169    213
9_H5     170    210
11_A11   171    207
11_A3    172    193
6_B4     173    183
10_G4    174    180
10_H9    175    176
2_H1     176    174
11_D5    177    172
2_B11    178    163
8_F2     179    158
11_B9    180    150
11_B1    181    149
1_G12    182    146
12_C7    183    145
9_B8     184    145
6_D11    185    143
4_C3     186    138
7_G8     187    130
9_C1     188    121
10_B3    189    120
2_G12    190    119
11_D10   191    114
10_H7    192    113
5_F8     193    112
11_C10   194    111
8_B7     195    111
2_C9     196    110
1_H11    197    109
11_B4    198    109
10_F7    199    104
8_F3     200    104
9_H6     201    100
4_G3     202    98
5_G8     203    98
7_C9     204    93
1_G6     205    92
11_C12   206    89
11_D3    207    88
2_C4     208    84
5_G2     209    83
1_C4     210    80
5_G10    211    79
1_C10    212    78
11_A4    213    75
3_G7     214    74
12_D11   215    73
7_G6     216    73
8_C11    217    73
11_B3    218    71
6_F8     219    70
9_C10    220    68
1_B10    221    67
1_C11    222    67
5_H9     223    67
9_B4     224    66
11_D7    225    65
2_G1     226    64
10_C11   227    60
12_D12   228    58
9_G2     229    56
4_D7     230    55
1_H6     231    54
8_H6     232    52
10_H10   233    49
5_D8     234    49
12_D4    235    48
12_C9    236    47
2_F7     237    47
9_F1     238    45
1_H10    239    43
8_G5     240    42
6_C7     241    41
2_G6     242    37
8_H11    243    36
12_D9    244    32
2_F11    245    32
12_C8    246    31
5_G11    247    31
11_C5    248    30
11_D9    249    30
2_D11    250    30
9_F8     251    29
10_B9    252    28
4_D10    253    28
6_B10    254    28
5_C10    255    27
10_F4    256    26
5_D12    257    26
9_G3     258    26
2_D1     259    25
5_H11    260    25
9_C11    261    25
2_B6     262    23
8_G4     263    23
9_B3     264    21
11_B11   265    19
6_B8     266    19
6_F2     267    19
7_D4     268    19
7_H1     269    19
2_B8     270    18
3_D2     271    18
8_D10    272    18
6_F7     273    17
8_G12    274    17
9_C9     275    17
11_A12   276    15
12_B4    277    15
2_C6     278    15
6_D7     279    15
10_H4    280    14
2_H2     281    14
6_F3     282    14
11_C2    283    13
2_F8     284    13
10_G6    285    12
2_B2     286    12
1_F4     287    11
10_H5    288    11
12_D8    289    11
4_H1     290    11
7_G11    291    11
8_C12    292    11
9_H8     293    11
11_A5    294    10
3_C3     295    10
5_F11    296    10
10_C12   297    9
3_H8     298    9
4_B4     299    9
8_H3     300    9
10_B7    301    8
10_C7    302    8
11_A6    303    8
11_D11   304    8
11_D2    305    8
2_F2     306    8
6_B6     307    8
6_B9     308    8
7_C8     309    8
10_A12   310    7
10_D8    311    7
11_D1    312    7
12_A9    313    7
12_C11   314    7
2_B3     315    7
4_H4     316    7
6_B11    317    7
6_B3     318    7
9_B11    319    7
12_A1    320    6
12_A11   321    6
3_G8     322    6
11_A10   323    5
2_G5     324    5
3_D3     325    5
8_G7     326    5
1_C7     327    4
1_D10    328    4
10_H2    329    4
12_C3    330    4
2_B10    331    4
2_D7     332    4
2_H6     333    4
5_F4     334    4
5_G12    335    4
6_D6     336    4
7_B12    337    4
8_D2     338    4
9_D6     339    4
1_A6     340    3
1_C6     341    3
12_B5    342    3
3_H2     343    3
6_G7     344    3
7_A1     345    3
7_D3     346    3
7_H3     347    3
9_C7     348    3
9_E2     349    3
9_F3     350    3
1_C3     351    2
1_F3     352    2
1_G11    353    2
3_D12    354    2
3_H5     355    2
5_B8     356    2
6_A9     357    2
7_E1     358    2
8_H2     359    2
9_D2     360    2
9_E12    361    2
9_E5     362    2
9_E9     363    2
1_D9     364    1
1_G9     365    1
10_D4    366    1
10_E11   367    1
10_E3    368    1
10_E4    369    1
10_E5    370    1
11_B7    371    1
12_B8    372    1
12_B9    373    1
12_D3    374    1
12_D7    375    1
2_F6     376    1
3_C2     377    1
3_G6     378    1
5_A9     379    1
5_B3     380    1
6_B12    381    1
7_F4     382    1
8_B8     383    1
8_C3     384    1
8_G6     385    1
9_B9     386    1
9_E1     387    1
9_E10    388    1
9_H10    389    1
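The ranking in Table 4 orders clones by how often each was selected across 1,000 repetitions. A minimal sketch of that tally follows, assuming each repetition yields a set of selected clone names. The function name `rank_by_selection`, the toy input, and the tie-breaking by clone name are illustrative assumptions, not the procedure used to produce the table.

```python
from collections import Counter

def rank_by_selection(runs):
    """Rank clones by the number of runs in which they were selected."""
    # Count each clone at most once per run.
    counts = Counter(name for selected in runs for name in set(selected))
    # Sort by descending count; ties are broken by name (an assumption)
    # so that the resulting order is deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, name, n) for rank, (name, n) in enumerate(ordered, start=1)]

# Toy input: three hypothetical runs, not values taken from Table 4.
runs = [{"1_H3", "2_D4"}, {"1_H3", "5_C8"}, {"1_H3", "2_D4"}]
ranking = rank_by_selection(runs)
for rank, name, n in ranking:
    print(rank, name, n)
```

With 1,000 runs as input, the third element of each tuple corresponds to the "number of times selected" column.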

TABLE 5 Eight of the 130 top clones represented epitopes. These clonescontained gene fragments in the reading frame of the T7 gene 10 capsidgene and thus expressed the same original peptides coded by the insertedgene fragments. Peptide sequences of Size Description of the Mimotopesof sequences that are Description of the genes that are in-frame withpep- in-frame with T7 Rank Clone in-frame with T7 10B gene T7 10B genetide 10B gene 4 10_F12 gi|19717685|gb|AF487338.1| AEMFPEG 51gi|20503274|gb|AAL96264.2| H sapiens multiple myeloma AGPYVDL Multiplemyeloma overexpression gene 2 DEAGGST overexpression (MYEOV2) mRNAGLLMDLA gene 2 Length = 452, ANEKAVH Score = 698 bits(352) ADFFNDFExpect = 0 EDLFDDD Id = 363/368 (98%) DIQ Gaps = 0/368(0%) 12 5_C8gi|190805|gb|M37387.1|HUMQBPIC LVSRPFQ 20 gi|190806|gb|AAA60361.1| Humanmitochondrial HQASGW Ubiquinone ubiquinone-binding protein MVFENGIbinding protein (HQPI) gene, exon 2 TMLQDSI Length = 75 NWG Score = 131bits(66) Expect = 2e−28 Id = 73/74(98%) Gaps = 1/74(1%) 19 7_G4gi|78499271|gb|DQ246833.1| PPWSPIVE 27 gi|29690911|gb|AAQ88761.1| Hsapiens isolate IND23 LLDAQVA NADH mitochondrion, complete ADPDKLVdehydrogenase genome, ERFELAV subunit 1 Length = 16320 DALSPEV Score= 218 bits (110) YTTYFVT Expect = 2e−54 KTLLLTS Id = 118/121(97%) LFLGaps = 0/121 (0%) Product = NADH dehydrogenase subunit 1 40 10_G7gi|19717685|gb|AF487338.1| AEMFPEG 51 gi|20503274|gb|AAL96264.2| Hsapiens multiple myeloma AGPYVDL Multiple myeloma overexpression gene 2DEAGGST overexpression (MYEOV2) mRNA GLLMDLA gene 2 Length = 452,ANEKAVH Score = 698 bits(352) ADFFNDF Expect = 0 EDLFDDD Id = 363/368(98%) DIQ Gaps = 0/368(0%) 41 5_C11 gi|190805|gb|M37387.1|HUMQBPICLVSRPFQ 20 gi|190806|gb|AAA60361.1| Human mitochondrial HQASGWUbiquinone ubiquinone-binding protein MVFENGI binding protein (HQPI)gene, exon 2 TMLQDSI Length = 75 NWG Score = 131 bits(66) Expect = 2e−28Id = 73/74(98%) Gaps = 1/74(1%) 66 10_G12gi|1633547|gb|U47924.1|HSU47924 GRKGLEF 48 
gi|19923951|ref|NP612434.1Human cromosome 12p13 ARLVKSY C10 protein sequence, complete sequenceEAQDPEI Length = 222930 ASLSGKL Features in this part of subject KALFLPPsequence: C10 MTLPPHG Score = 430 bits (217) PAAGGSV Expect = 8e−118 AASId = 224/227 (98%) Gaps = 0/227 (0%) 81 1_C12gi|5360992|emb|AL023494.12|HS366L4 GWSAMA 28 gi|46409510|ref|NP997326.1|Human DNA sequence from QSWLTAT hypothetical clone RP3-366L4 on STSRVQVIprotein chromosome 22q11.23-12.2, LLPQPPE LOC400242 Length = 37658 Score= 309 Bits (156) Expect = 6e−82 Id = 156/156 (100%) Gaps = 0/156 (0%)Strand = Plus/Minus 111 11_A2 gi|19717685|gb|AF487338.1| AEMFPEG 51gi|20503274|gb|AAL96264.2| H sapiens multiple myeloma AGPYVDL Multiplemyeloma overexpression gene 2 DEAGGST overexpression (MYEOV2) mRNAGLLMDLA gene 2 Length = 452 ANEKAVH Score = 698 bits(352) ADFFNDF Expect= 0 EDLFDDD Id = 363/368 (98%) DIQ Gaps = 0/368(0%) Rank Region ofsimilarity of peptide 4 Score = 168 bits (389), Expect = 3e−42 Id= 51/51 (100%), Pos = 51/51 (100%) Gaps = 0/51 (0%) Query2EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ       EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ Sbjct7EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ 12 Score = 72.7 bits(164) Expect = 6e−14 Id = 20/20 (100%) Pos = 20/20 (100%) Gaps = 0/20(0%) Query 10 ASGWMVFENGITMLQDSINW 29          ASGWMVFENGITMLQDSINWSbjct  4 ASGWMVFENGITMLQDSINW 23 19 Score = 49.3 bits (116) Expect= 5e−05 Id = 23/27 (85%) Pos = 25/27 (92%) Gaps = 0/27 (0%) Query 27 LAVDALSPEVYTTYFVTKTLLLTSLFL  53          +  DALSPE+YTTYFVTKTLLLTSLFL Sbjct 245MTYDALSPELYTTYFVTKTLLLTSLFL 271 40 Score = 168 bits (389) Expect = 3e−42Id = 51/51 (100%) Pos = 51/51 (100%) Gaps = 0/51 (0%) Query2EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ       EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ Sbjct7EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ 41 Score = 72.7 bits(164) Expect = 6e−14 Id = 20/20 (100%) Pos = 20/20 (100%) Gaps = 0/20(0%) Query 10 ASGWMVFENGITMLQDSINW 29    
      ASGWMVFENGITMLQDSINWSbjct  4 ASGWMVFENGITMLQDSINW 23 66 Score = 144 bits (333) Expect= 7e−35, Id = 47/48 (97%) Positives = 48/48 (100%) Gaps = 0/48 (0%) Q 11LEFARLVKSYEAQDPEIASLSGKLGALFLPPMTLPPHGPAAGGSVAAS  58     L+FARLVKSYEAQDPEIASLSGKLGALFLPPMTLPPHGPAAGGSVAAS S 79LKFARLVKSYEAQDPEIASLSGKLGALFLPPMTLPPHGPAAGGSVAAS 126 81 Score = 82.5bits (187) Expect = 7e−17 Id = 25/28 (89%) Positives = 26/28 (92%) Gaps= 0/28 (0%) Query   1 GWSAMAQSWLTATSTSRVQVILLPQPPE  28          GWASAMAQSWLT TS S+VQVILLPQPPE Sbjct 121GWSAMAQSWLTATSTSRVQVILLPQPPE 148 111 Score = 168 bits (389) Expect= 3e−42 Id = 51/51 (100%) Pos = 51/51 (100%) Gaps = 0/51 (0%) Query2EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ       EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ Sbjct7EMFPEGAGPYVDLDEAGGSTGLLMDLAANEKAVHADFFNDFEDLFDDDDIQ
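The "Id" figures reported in the alignments of Table 5 are counts of identical positions over the aligned length. The sketch below recomputes one of them from the aligned query and subject strings shown for clone 7_G4. The function name `percent_identity` is hypothetical, and the percentage is truncated rather than rounded because that matches the figures in the table (for example, 363/368 is reported as 98%).

```python
def percent_identity(query, subject):
    """Return (identical positions, aligned length, truncated percent identity)."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    # A position counts as identical only if both residues match and are not gaps.
    matches = sum(1 for q, s in zip(query, subject) if q == s and q != "-")
    return matches, len(query), 100 * matches // len(query)

# Aligned segments for clone 7_G4 (rank 19) as they appear in Table 5.
query = "LAVDALSPEVYTTYFVTKTLLLTSLFL"
subject = "MTYDALSPELYTTYFVTKTLLLTSLFL"
result = percent_identity(query, subject)
print(result)  # (23, 27, 85), i.e. Id = 23/27 (85%)
```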

Table 6. One hundred twenty-two of the top 130 clones contained peptides which were different from the original proteins coded by the inserted gene fragments. This occurred because (a) the inserted gene fragment was out of frame with the open reading frame of the T7 10B gene (n=61), (b) the inserted gene fragment represented an untranslated region of a known gene (n=18), or (c) the inserted gene fragment represented sequences from an unknown gene (n=43).
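The three categories above, together with the eight in-frame clones of Table 5, account for all 130 top clones. The sketch below shows one way such a grouping could be expressed and checks the stated counts. The function name `classify_clone`, the region labels, and the order in which the conditions are tested are illustrative assumptions, not a description of how the clones were actually annotated.

```python
from collections import Counter

def classify_clone(in_frame, region):
    """Assign a clone to a category.

    in_frame: True if the insert reads in frame with the T7 10B gene.
    region:   "coding", "utr", or "unknown" (hypothetical labels).
    """
    # Assumption: the origin of the insert is checked before its reading frame.
    if region == "unknown":
        return "unknown gene"
    if region == "utr":
        return "untranslated region"
    return "epitope" if in_frame else "out of frame"

# Counts stated in the text for the top 130 clones.
stated = Counter({"out of frame": 61, "untranslated region": 18, "unknown gene": 43})
mimotope_total = sum(stated.values())  # 122 clones covered by Table 6
epitope_total = 130 - mimotope_total   # 8 clones, as stated for Table 5
print(mimotope_total, epitope_total)
```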

TABLE 6a Peptide se- quences of Mimotopes, Size Description of the genesin-frame of that are in Mimotope with T7 pep- Description of thesequences Region of similarity of peptide Rank Clone clones 10 B genetide that Mimotopes mimic (only the top 3 sequences are shown)  1 1_H3gi|13097209|gb|BC00337 KLLHQGAR 24 gi|2135396|pir|S71548 Score = 28.6bits (60), Expect = 1.1 0.1| RRRGLRTP Homeotic protein pG2 Id = 12/22(54%), Positives = 14/22 (63%) Homo sapiens cystatin B ASVPISPSgi|85726204|gb|ABC79623.1| Gaps = 3/22 (13%) (stefin B) (cDNA clonecytokeratin associated Query 4 MGC: 17497 protein LHQGARRRRGLRTPASVPISPS25 IMAGE: 3453675), gi|4505037|ref|NP_003564.1| L QG RRR  L   +SVP +PScomplete cds Latent transforming growth Sbjct 228 Length = 613 factorbeta binding protein LQQGGRPRGDL---SSVPTAPS 246 Score = 517 bits (261) 4Score = 26.1 bits (54), Expect = 9.3 Expect = 1e−144gi|56206067|emb|CAI23573.1 Identities = 9/11 (81%), Positives = 9/11(81%), Id = 265/267 (99%) Myc target 1 Gaps = 1/11 (9%) Gaps = 0/267(0%) gi|1703642|gb|AAB37683.1| Query 7 Strand = Plus/Plus p65 Net1proto-oncogene ARRRRGLRTPA 17 protein ARRRA LRT A Sbjct 292 ARRRR-LRTAA301 Score = 24.4 bits (50), Expect = 21 Id = 9/15 (60%), Positives= 9/15 (60%), Gaps = 5/15 (33%) Query 9 RRRRGLRTPASVPIS 23 RRRR     ASPIS Sbjct 95 RRRR-----ASAPIS 104  2 1_H8 gi|20357564|ref|NM_0001 VIQEPG34 gi|28892|emb|CAA35582.1| Score = 30.3 bits (64), Expect = 0.34 00.2|GRGDKL Unnamed protein product Id = 12/22 (54%), Positives = 14/22(63%), Homo sapiens cystatin B LHQGAR gi|2135396|pir||S71548 Gaps = 3/22(13%) (stefin B) (CSTB), RRRCLR Homeotic protein pG2 Query 9 mRNAcystatin B (liver TPASVPI gi|48428276|sp|O43432|1F4GLHQGARRRXGLRTPASVPISPS 30 thiol proteinase SPS 3_HUMAN L QGRRR  L   +SVP +PS inhibitor) Eukaryotic translation Sbjct 227 Length= 674 initiation factor 4 LQQGGRRRGDL---SSVPTAPS 245 Score = 424 bits(214) gamma3(eIF-4-gamma3) Score = 26.9 bits (56), Expect = 4.6 Expect= 6e−116 (eIF-4-gamma II) Id = 15/28 (53%), 
Positives = 17/28 (60%), Id= 214/214 (100%) gi|2829451|sp|P56182|NNP1 Gaps = 7/28 (25%) Gaps= 0/214 (0%) HUMAN NNP-1 protein Query 1 Strand = Plus/Plus (Novelnuclear protein 1) GRVIQEPGGRGDKLLHQGARR-----RR 23 (Nucleolar proteinNop52) GR  Q PGGRG  LL+ G+RR     RR gi|2190402|emb|CAA73944.1| Sb 880latent TGF-beta binding GR-QTPGGRGVPLLNVGSRRSQPGQRR 905 protein-4 Score= 25.2 bits (52), Expect = 28 gi|56206067|emb|CAI23573.1 Id = 11/20(55%), Positives = 13/20 (65%), myc target 1 Gaps = 6/20 (30%)gi|34098712|sp|Q9H0U9|TSY Query 15 L1_HUMANTestis-specificGRGDKLLHQGARRRRGLRTP 34 Y-encoded-like protein 1 GRG     +GAR+RR  RTPgi|1703642|gb|AAB37683.1| Sbjct 424 p65 Net1 proto-oncogeneGRG----QRGARQRR--RTP 437 protein  6 10_H gi|75905883|gb|AY96358 NSSVD 5gi|123291744|emb|CA114853. Score = 18.0 bits (35), Expect = 1284 6 5.2|2| ADAMTS-like Id = 5/5 (100%), Positives 5/5 (100%), Homo sapiensisolate gi|119617528|gb|EAW97122. Gaps = 0/5 (0%) 14_LOf(Tor65)1| SLIT-ROBO Rho Query 1 mitochondrion GTPase activating protein NSSVD 5Length = 16569 1, isoform CRA_a NSSVD Score = 232 bits (117)gi|119591326|gb|EAW70920. Sbjct 541 Expect = 2e−58 1|SP140 nuclear bodyNSSVD 545 Id = 119/120 (99%) protein, isoform CRA_a Gaps = 0/120 (0%)gi|119620377|gb|EAW99971. Strand = Plus/Minus 1| EH domain bindingprotein 1, isoform CRA_c 10 2_G8 gi|39992414|gb|BC06441 GSGKIKKSV 20gi|56202484|emb|CAI21939.1 Score = 24.0 bits (49), Expect = 29 8.1| Homosapiens LWDRKVGI HBxAg transactivated Id = 9/13 (69%), Positives = 9/13(69%), FK506 binding protein RKN protein 2 (XTP2) Gaps = 0/13 (0%) 9,63kDa, mRNA gi|119611307|gb|EAW90901. 
Query 2 (cDNA 1|BAT2 domaincontaining SGKIKKSVIMDRK 14 cloneIMAGE: 5750487), 1, isoform CRA_d SGIKK VL D K partial cds gi|29427863|sp|Q8WUM0|N Sbjct 83 Length = 2421U133_HUMAN SGPIKKPVLRDMK 95 Score = 486 bits (245) Nuclear pore complexScore = 24.0 bits (49), Expect = 41 Expect = 1e−134 protein Nup133 Id= 6/6 (100%), Positives = 6/6 (100%), Id = 259/264 (98%) (NuclcoporinNup133) Gaps = 0/6 (0%) Gaps = 0/264 (0%) (133 kDa nucleoporin) Query 8Strand = Plus/Minus gi|61216828|sp|Q96HF1|SFR SVLWDR 13 P2_HUMANSecreted SVLWDR apoptosis - related protein 1 Sbjct 101gi|12803735|gb|AAH02704.1 SVLWDR 106 Signal transducer and Score = 23.1bits (47), Expect = 73 activator of transcription 1 Id = 6/6 (100%),Positives = 6/6 (100%), gi|29337296|ref|NP_803881.1 Gaps = 0/6 (0%)Melanoma antigen family D, Query 6 4 isoform 2 KKSVLW 11gi|57012811|sp|Q7Z419|IBR KKSVLW D2_HUMAN Sbjct 233 E3 ubiquitin ligaseIBRDC2 KKSVLW 238 (p53- inducible RING finger protein) 11 2_H4gi|39992414|gb|BC06441 GSRKIXWT 56 Gi|4758932|ref|NP_004563.1| Score= 48.1 bits (106), Expect = 5e−06 8.1| ALWETKVG Plakophilin 2 isoform 2bId = 15/17 (88%), Positives = 16/17 (94%), Homo sapiens FK506 LCLKLKMDgi|15929032|gb|AAH14974.1| Gaps = 0/17 (0%) binding protein 9, 63EPCLSHAC VIF1B protein Query 30 kDa, mRNA(cDNA YPNTLGGQgi|15341934|gb|AAH13155.1| HACYPNTLGGQGGRITR 46 cloneIMAGE: 5750487),GGRITRXR CRYZL1 protein HAC P+TLGGQGGRITR partial cds LRPSWPTQgi|66932986|ref|NP_0010193 Sbjct 477 Length = 2421 86.1| filamin-bindingLIM HACNPSTLGGQGGRITR 493 Score = 307 bits (155) protein-1 isoform bScore = 43.9 bits (96), Expect = 9e−05 Expect = 3e−81gi|386941|gb|AAA59814.1| Id = 14/17 (82%), Positives = 16/17 (94%), Id= 158/159 (99%) MHC HLA-DR-beta-1 Gaps = 0/17 (0%) Gaps = 0/159 (0%)chain Query 30 Strand = Plus/Minus gi|11038659|ref|NP_067610.1HACYPNTLGGQGGRITR 46 ADAM metallopeptidase HAC P+TLGG+GGRITR withtlironibospondin type 1 Sbjct 270 gi|62088182|dbj|BAD92538.1HACNPSTLGGRGGRITR 286 SLC2A11 protein variant Score = 41.4 
bits (90),Expect = 5e−04 gi|33356179|ref|NP_031370.2 Id = 13/16 (81%), Positives= 14/16 (87%), transcription termination Gaps = 0/16 (0%) factor, RNApolymerase 1 Query 30 gi|7416053|dbj|BAA93676.1| HACYPNTLGGQGGRIT 45survivin-beta H C P+TLGGQGGRIT gi|3335138|gb|AAC39892.1| Sbjct 98 RNApolyinerase 1 40 kD HVCNPSTLGGQGGRIT 113 gi|522145|gb|AAB02649.1| B-cellgrowth factor 15 6_G1 gi|33876180|gb|BC00141 LAQLQHGK 15gi|38372397|sp|Q9ULK0|GR1 Score = 23.5 bits (48), Expect = 54 10.2| Homo sapiens S100 NLQPYRD D1_HUMAN Glutamate Id = 8/12 (66%),Positives = 8/12 (66%), calcium binding protein receptor delta-1 subunitGaps = 3/12 (25%) A11 (calgizzarin), precursor (GluR delta-1) Query 4mRNA (cDNA clone gi|2499758|sp|Q92729|PTPR LQHGKNLQPYRD 15 MGC: 2149U_HUMAN LQHG    PYRD IMAGE: 3140092), Receptor type tyrosine Sbjct 774complete cds protein phosphatase U LQHGS---PYRD 782 Length = 568precursor Score = 22.3 bits (45), Expect = 95 Score 844 bits (426)(Pancreatic carcinoma Id = 6/8 (75%), Positives = 7/8 (87%), Expect= 0.0 phosphatase 2) (PCP-2) Gaps = 0/8 (0%) Id = 430/432 (99%)gi|2441904|jgb|AAL65133.2| Query 8 Gaps = 0/432 (0%) Ovarian cancerrelated KNLQPYRD 15 Strand = Plus/Plus tumor marker CA125 KNL PYR+ /gene= “100A11” gi|30060232|gb|AAP13073.1| Sbjct 451 E3 ligase for inhibinKNLLPYRN 458 receptor Score = 22.3 bits (45), Expect = 95gi|5053128|gb|AAD33053.2| Id = 9/14 (64%), Positives = 10/14 (71%),Scar2 Gaps = 1/14 (7%) gi|59800456|sp|Q9Y6W5|WA Query 1 SF2_HUMANLAQLQHG-KNLQPY 13 Wiskott-Aldrich syndrome L+QL HG K L PY protein familymember 2 Sbjct 13105 (WASP-family protein LSQLTHGIKELGPY 13118 member2)(WAVE-2 Score 22.3 bits (45), Expect = 95 protein) Id = 9/14 (64%),Positives = 10/14 (71%), Gaps = 1/14 (7%) Query 1 LAQLQHG-KNLQPY 13 L+QLHG K L PY Sbct 14041 LSQLTHGIKELGPY 14054 Score = 19.7 bits (39), Expect= 556 Id = 10/19 (52%), Positives = 11/19 (57%), Gaps = 4/19 (21%) Query1 LAQLQHG-KNLQPY---RD 15 L+QL HG   L PY   RD Sbjct 20748LSQLTHGITELGPYTLDRD 20766 Score = 18.0 
bits (35), Expect = 1802 Id= 8/14 (57%), Positives = 9/14 (64%), Gaps = 1/14 (7%) Query 1LAQLQHG-KNLQPY 13 L+QL HG   L PY Sbjct 15291 LSQLTHGITELGPY 15304 2110_G gi|17390908|gb|BC01838 NSSL 4 gi|125987841|sp|Q99102|MU Score= 14.6 bits (27), Expect = 10793 11 6.1| Homo sapiens C4_HUMAN Id = 4/4(100%), Positives = 4/4 (100%), cytochrome c oxidase Pancreaticadenocarcinoma Gaps = 0/4 (0%) subunit VIIb, mRNA mucin Query 1 (cDNAclone MGC: 9065 gi|125863935|sp|Q8N584|CR NSSL 4 IMAGE: 3861730)017_HUMAN NSSL Length = 454 TPR repeat-containing Sbjct 44 Score = 549bits (277), protein C18orf17 NSSL 47 Expect = e−153gi|121944808|emb|CAK5073 Id = 316/328 (96%), 5.1| immunoglobulin A Gaps= 3/328 (0%) heavy chain variable region Strand = Plus/Plusgi|120431468|gb|ABM21709. tissue_type = “Ovary, 1| env proteinadenocarcinoma” gi|119943116|ref|NP_001073 /gene = “COX7B” 328.1| Gprotein-coupled /product = “cytochrome c receptor 64 isoform 2 oxidasesubunit VIIb, precursor” 22 11_C gi|4503528|ref|NM_00141 VDSRTRSM 19gi|47458813|ref|NP_006242.4 Score = 28.6 bits (60), Expect = 1.1 4 6.1|TYSKSSTAT Protein kinase, AMP- Id = 14/27 (51%), Pos = 16/27 (59%),Eukaryotic translation PR activated, a-1 catalytic Gaps = 9/27 (33%)initiation factor 4A, subunit isoform 1 Query 1 isoform I (EIF4A1)(PRKAA1) VDSRT-----RSM----TYSKSSTATP 18 mRNA gi|47605556|sp|Q9NYF8|BCLVDSRT     RS+    T +KS TATP Length = 1383 F1_HUMAN Bcl-2 Sbjct471 Score= 371 bits(187) associated transcriptional VDSRTYLLDSRSIDDEITEAKSGTATP497 Expect = 5e−100 factor (Btl) Score = 25.7 bits (53), Expect = 9.0 Id= 199/205(97%) gi|8039798|sp|P30414|NKCR Id = 8/11 (72%), Pos = 10/11(90%), Gaps = 0/205(0%) _HUMAN NK-tumor Gaps = 0/11 (0%) Strand= Plus/Plus recognition protein Query 3 gi|5870841|gb|AAA35734.2|SRTRSMTYSKS 13 cyclophilin-related SR+RS TYS+S protein Sbjct 36gi|24419041|gb|AAL65133.2| SRSRSRTYSRS 46 Ovarian cancer related Score= 23.5 bits (48), Expect = 52 tumor marker CA125 Id = 9/18 (50%), Pos= 12/18 (66%), 
gi|45768720|gb|AAH67812.1| Gaps = 6/18 (33%) Cyclin L1Query 10 RTRSMTY-----------SK SST 21 RTRS++Y    S+  SST Sbjct 1328RTRSVSYSHSRSRSRSST 1345 26 1_D7 gi|57165047|gb|AY86482 LPRVQAQG 64gi|148429165|sp|Q86V8|THO Score = 36.7 bits (79), Expect = 0.016 4.1|GRVPEETE C4_HUMAN Id = 19/37 (51%), Positives = 21/37 (56%), Homosapiens CDC37 GAGGGRGR THO complex subunit 4 Gaps = 14/37 (37%) celldivision cycle 37 QGRAGAPA (Tho4) (Transcriptional Query 8 homolog (S.cerevisiae) GRGTAAAQ coactivator Aly/REF)GGRVPEETEGAGGGRGRQGRAGAPAGRGTAAAQGGAE 44 (CDC37) gene, complete GGAELGAEgi|47117879|sp|P83369|LSM1 GGR        GGGRCR GRAG+  GRG     GGA+ cdsAGGDAQEG 1_HUMAN Sbjct 21 Length = 16460 SLRPHSSN U7 snRNA-associatedSm- GGR--------GGGRGR-GRAGSQGGRG-----GGAQ 43 Score = 333 bits (168) likeprotein LSm11 Score = 35.0 bits (75), Expect = 0.069 Expect = 4e−89gi|62087836|dbj|BAD92365.1 Identities = 17/24 (70%), Positives = 17/24(70%), Id = 168/168 (100%) serum response factor Gaps = 2/24 (8%) Gaps= 0/168 (0%) (c-fos serum response Query 19 Strand = Plus/Pluselement-binding GGGRGRQGRA-GAPAGRGTAAAQG 41 transcriptional factor)GGGRGR GRA GA AG G AA  G variant Sbjct 69 gi|46397045|sp|Q9NQ03|SCRGGGRGR-GRARGAAAGSGVPAAPG 91 T2_HUMAN Score = 28.6 bits (60), Expect= 4.3 Transcriptional repressor Identities = 24/58 (41%), Positives= 24/58 (41%), scratch 2 Gaps = 17/58 (29%) Q6AQGGRVPEETEGAGGGRGRQGRAGAPAGRGTAAAQGGA---------------ELGAE 48 AGGRVP    GAG G GR  R  A A   T A   GA               ELGAE S49ANGGRVPG--NGAGLGPGRLEREAAAAAATTPAPTAGALYSGSEGDSESGEEEELGAE 104 27 10_Cgi|31107|emb|Z11692.1|H QTTPGDCP 14 gi|135959|sp|P19438|TNR1A Score= 24.0 bits (49), Expect = 41 9 SEF2MR H.sapiens DTATLP _HUMAN Id = 6/7(85%), Positives = 7/7 (100%), mRNA for elongation Tumor necrosis factorGaps = 0/7 (0%) factor 2 receptor superfamily Query 3 Length = 3080member 1A precursor TPGDCPD 9 Score = 848 bits (428) (p60) (TNF-R1)(TNF-R1) TPGDCP+ Expect = 0.0 (TNFR-1) (p55) (CD120a Sbjct 300 Id= 430/431 (99%) antigen) TPGDCPN 
306 Gaps = 0/431 (0%) Tumor necrosisfactor- Score = 23.1 bits (47), Expect = 74 Strand = Plus/Plus bindingprotein 1 (TBP1) Id = 6/6 (100%), Positives = 6/6 (100%),gi|119587880|gb|EAW67476. Gaps = 0/6 (0%) 1| Cas-Br-M (murine) Query 3ecotropic retroviral TPGDCP 8 transforming sequence TPGDCPgi|119574887|gb|EAW54502. Sbjct 514 1| ubiquitin specific TPGDCP 519peptidase 54 Score = 23.1 bits (47), Expect = 74gi|115855|sp|P22681|CBL_(—) Id = 11/17 (64%), Positives = 11/17 (64%),HUMAN Gaps = 5/17 (29%) E3 ubiquitin-protein ligase Query 2(Proto-oncogene c-CBL) TTPGDC-PD---TATLP 14 (Casitas B-lineage TTPG CP    TATLP lymphoma proto-oncogene) Sbjct 1375 TTPG-CNPQLTYTATLP 1390 299_G4 gi|71482588|ref|NM_0010 NSSV 4 gi|126215737|sp|Q1L5Z9|LO Score= 14.6 bits (27), Expect = 10793 20.4| NF2 HUMAN Id = 4/4 (100%),Positives = 4/4 (100%), Homo sapiens ribosomal Neuroblastoma apoptosis-Gaps = 0/4 (0%) protein S16 (RPS16), related protease Query 1 mRNAgi|120587027|ref|NP_065769. NSSV 4 Length = 603 3| ubiquitin specificNSSV Score = 260 bits (131) peptidase 31 Sbjct 206 Expect = 4e−67gi|119626680|gb|EAX06275.1 NSSV 209 Id = 131/131 (100%) alpha-kinase 1Gaps = 0/131 (0%) gi|119620124|gb|EAW99718. 
Strand = Plus/Plus 1| dualspecificity phosphatase 11 31 11_B gi|74027260|ref|NM_0068 ERGKKRKI 35gi|57863277|ref|NP_0010099 Score = 29.9 bits (63), Expect = 0.55 1246.2| REMLQDM 21.1| hypothetical protein Id = 10/18 (55%), Positives= 13/18 (72%), Homo sapiens serine VPVVVEEE LOC23355 isoform a Gaps= 3/18 (16%) peptidase inhibitor, TLRTNVLSI gi|14042300|dbj|BAB55190.1|Query 7 Kazal type 5 (SPINK5), GNK unnamed protein productKIREM---LQDMVPVVVE 21 mRNA gi|50083293|ref|NP_116205.3KI+ M    QDMVPV+V+ Length = 3610 hypothetical protein Sbjct 569 Score= 216 bits (109), LOC84902 KIQVMEQHFQDMVPVIVD 586 Expect = 1e−53gi|2815622|gb|AAC39562.1| Score = 29.1 bits (61), Expect = 0.98 Id= 109/109 (100%), kinesin-related protein Id = 12/25 (48%), Positives= 14/25 (56%), Gaps = 0/109 (0%) gi|2407245|gb|AAB70531.1| Gaps = 10/25(40%) Strand = Plus/Plus putative transcription factor Query 6 CR53RKIREMLQDMVPVVVEEETL-RTNV 29 gi|38605529|sp|Q9Y4A5|TRRR+IRE+LQD         TL RT V AP_HUMAN Sbjct730 Transformation/transcriptionRRIRELLQD---------TLTRTGV 745 domain-associated protein Score 28.6 bits(60), Expect = 1.3 (350/400 kDa PCAF- Id = 12/24 (50%), Positives= 17/24 (70%), associated factor) Gaps = 4/24 (16%)gi|14589891|ref|NP_001784.2 Query 5 cadherin 3, type 1KRKIREMLQDMVPVVVEEET--LR 26 preproprotein KR+ REM Q+M  ++ +EET  LRgi|226518|prf||1516312A Sbjct 1 Ca dependent cell adhesionKRREREMQQEM--MLRDEETMELR 22 protein gi|4099609|gb|AAD00657.1| celldivision control-related protein 2b 32 2_C1 gi|14506788|ref|NM_00297RKGKKSKR 14 gi|57209583|emb|CA141374.1 Score = 25.7 bits (53), Expect= 9.1 0 0.1| RKWLNS Sarcoma antigen 1 Identities = 10/20 (50%),Positives = 12/20 (60%), Homo sapiens gi|8216987|emb|CAB92443.1| Gaps= 7/20 (35%) spermidine/spermine Putative tumor antigen Query 1N1-acetyltransferase gi|23396625|sp|Q9NQT8|K11 RKGKK---SKRRK----WLN 13(SAT), mRNA 3B_HUMAN RK K+   SKRRK    WL+ Length = 1060 Kinesin-likeprotein Sbjct 51 Score = 1031 bits(520) KIF13B (Kinesin-likeRKSKRHSSSKRRKSMSSWLD 70 
Expect = 0.0 protein GAKIN) Score = 24.8 bits(51), Expect = 22 Id = 520/520(100%) gi|7227890|sp|O24175|FL_O Id = 6/6(100%), Positives = 6/6 (100%), Gaps = 0/520(0%) RYSA Gaps = 0/6 (0%)Strand = Plus/Plus Putative transcription Query 8 factor FL (RFL) RRKWLN13 gi|51701343|sp|Q8TD10|CHD RRKWLN 5_HUMAN Sbjct 1092 Chromodomainhelicase- RRKWLN 1097 DNA-binding protein Score = 24.4 bits (50), Expect= 30 5(CHD-5) Id = 7/10 (70%), Positives = 10/10 (100%),gi|126178|sp|P14151|LYAM1 Gaps = 0/10 (0%) _HUMAN Query 1 L-selectinprecursor RKGKKSKRRK 10 (Lymph node homing RKGKK++R+K receptor)(Leukocyte Sbjct 172 surface antigen Leu- RKGKKARRKK 181 8) (TQ1)(gp90-MEL) (Leukocyte- endothelial cell adhesion molecule 1) (LECAM1)(CD62L antigen) 35 11_C gi|45359854|ref|NM_0020 TKAK 4gi|54607086|ref|NP_068756.2 Score = 14.6 bits (27), Expect = 7683 878.3| elongation factor for Id = 4/4 (100%), Positives = 4/4 (100%),Homo sapiens golgi selenoprotein translation Gaps = 0/4 (0%)autoantigen, golgin gi|57863304|ref|NP_056340.2 Query 1 subfamily a, 4inhibitor of Bruton's TKAK 4 (GOLGA4), mRNA tyrosine hinase TKAK Length= 7717 gi|52548190|gb|AAU82085.1| Sbjct 318 Score = 270 bits (136) tumoramplified and TRAK 321 Expect = 1e−69 overexpressed sequence 2 Id= 138/139 (99%) gi|20800447|gb|AAF23433.3| Gaps = 0/139 (0%) breastcancer-associated Strand = Plus/Minus antigen BRCAA1 37 8_C4gi|92443613|gb|BC01555 PNSG 4 gi|124001556|ref|NP_001028 Score = 15.1bits (28), Expect = 8044 9.1| Homo sapiens 685.2|G protein-coupled Id= 4/4 (100%), Positives = 4/4 (100%), ribosomal protein S24, receptor 34isoform 2 Gaps = 0/4 (0%) mRNA (cDNA clone gi|122892256|gb|ABM67195.Query 1 IMAGE: 3635225), 1| immunoglobulin heavy PNSG 4 Length = 3566chain variable region PNSG Score = 680 bits (343)gi|122939151|ref|NP_001073 Sbjct 225 Expect = 0.0 626.1| Rho GTPase PNSG228 Id = 343/343 (100%) activating protein 9 isoform Gaps = 0/343 (0%) 2Strand = Plus/Plus 38 10_C >gi|19913404|ref|NM_00 NSSA 4gi|124001558|ref|NP_689799. 
Score = 14.2 bits (26), Expect = 14482 43286.2| 3| ubiquitin specific Id = 4/4 (100%), Positives = 4/4 (100%),Homo sapiens protease 54 Gaps = 0/4 (0%) topoisomerase (DNA) Igi|122892053|gb|ABM67094. Query 1 (TOP1) 1| mutant lipase H NSSA 4 Score= 1003 bits (506) gi|120660404|gb|AAI30523.1| NSSA Expect = 0.0 Receptortyrosine kinase- Sbjct 2051 Id = 561/585 (95%) like orphan receptor 2NSSA 2054 Gaps = 1/585 (0%) gi|120660344|gb|AA130362.1| Strand= Plus/Plus BRCC2 gi|114432132|gb|AB174674.1| breast and ovarian cancersusceptibility protein 2 truncated variant 39 10_Ggi|62897624|dbj|AK22303 RRLKKFCIT 9 gi|60549585|gb|AAX24102.1| Score= 25.2 bits (52), Expect = 11 5 2.1| cation-transporting P5- Id = 8/11(72%), Positives = 8/11 (72%), Homo sapiens mRNA ATPase Gaps = 3/11(27%) for beta actin variant, gi|33469249|gb|AAQ19673.1| Query 1 clone:JTH03396 general transcription factor RRLKK---FCI 8 Length = 1820 II irepeat domain 2 RRLKK   FCI Score = 309 bits (156)gi|57209558|emb|CAI41998.1 Sbjct 459 Expect = 4e−82 melanoma associatedRRLKKRGIFCI 469 Id = 156/156 (100%) antigen (mutated) 1-like 1 Score= 23.5 bits (48), Expect = 36 Gaps = 0/156 (0%)gi|37932158|gb|AAP72185.1| Id = 6/6 (100%), Positives = 6/6 (100%),Strand = Plus/Minus migration-inducing protein 3 Gaps = 0/6 (0%)gi|13477103|gb|AAH05005.1| Query 3 Lung cancer-related protein 8 LKKFCI8 gi|6919941|sp|Q15113|PCOL LKKFCI C_HUMAN Sbjct 603 ProcollagenC-proteinase LKKFCI 608 enhancer protein precursor Score = 22.7 bits(46), Expect = 65 gi|61966767|ref|NP_0010136 Id = 6/6 (100%), Positives= 6/6 (100%), 82.1| stromal cell derived Gaps = 0/6 (0%) factor receptor2 homolog Query 1 gi|74759724|sp|Q8IYT8|ULK RRLKKF 6 2_HUMAN RRLKKFSerine/threonine-protein Sbjct 441 kinase ULK2 RRLKKF 446 42 1_D1gi|13097335|gb|BC00341 IFAGGGGP 26 gi|18204601|gb|AAH21228.1| Score= 31.6 bits (67), Expect = 2.9 2 8.1| AGPGRAGT FLJI0700 protein Id= 18/32 (56%), Positives = 18/32 (56%), Homo sapiens cyclic GGGRVASAgi|12644292|sp|P49840|GSK3 Gaps = 10/32 
(31%) AMP phosphoprotein, TRA_HUMAN Glycogen Query 2 19 kD, mRNA (cDNA synthase kinase-3 alphaFAGG---GGP----AGPG---RAGTGGGRVAS 23 clone MGC: 5468 (GSK-3 alpha) FGG   G P    A PG   RAGTGGG VAS IMAGE: 3451558),gi|6291532|dbj|BAA86298.1| Sbjct475 complete cds Cbl-cFHGGQPTGAPLDCAAAPGAHYRAGTGGGPVAS 506 Length = 1228gi|46397888|sp|Q9ULV8|CB Score = 29.1 bits (61), Expect = 1.1 Score= 335 bits (169) LC_HUMAN Signal Id = 15/24 (62%), Positives = 17/24(70%), Expect = 4e−89 transduction protein CBL-C Gaps = 7/24 (29%) Id= 169/169(100%) gi|20149596|ref|NP_036248.2 Query 10 Gaps = 0/169(0%)Cas-Br-M (murine) GGGGP----AGPGRAGTGGGRVAS 29 Strand = Plus/Plusecotropic retroviral GGGGP    +GPG  GTGGG+ AS transforming sequence cSbjct 32 gi|56206355|emb|CAI18503.1 GGGGPGGSASGPG--GTGGGK-AS 52 HLA-Bassociated Score = 27.4 bits (57), Expect = 2.7 transcript 3 Id = 11/15(73%), Positives = 12/15 (80%) gi|5453736|ref|NP_005351.2| Gaps = 2/15(13%) v-maf musculoaponeurotic Query 3 fibrosarcoma oncogeneAGGGGPAGPGRAGTG 17 homolog isoform a AGGGGP GPG  G+Ggi|21759253|sp|O75444|MAF Sbjct 65 _HUMAN AGGGGPGGPG--GSG 77Transcription factor Maf (Proto-oncogene c-mat)gi|47078300|ref|NP_542159.2 apoptosis related protein 3 isoform bgi|66932999|ref|NP_006276.2 testis-specific protein kinase 1gi|2499660|sp|Q15569|TESK 1_HUMAN Dual-specificity testis- specificprotein kinase 1 44 10_D gi|3329373|gb|AF038952. 
VKIMTLKS 17gi|56417662|emb|CAI19057.1 Score = 23.1 bits (47), Expect = 53 121|AF038952 RQRSYKNP Novel protein Id = 7/8 (87%), Positives = 7/8 (87%),Homo sapiens cofactor G gi|16923934|gb|AAL31642.1| Gaps = 0/8 (0%) Aprotein mRNA, NOV1 Query 5 complete cds gi|40675561|gb|AAH64981.1|TLKSRQRS 12 Length = 574 Putative NEkB activating TLK RQRS Score = 392bits (198) protein Sbjct 152 Expect = 2e−106 gi|51701611|sp|Q9BX67|JAMTLKERQRS 159 Id = 202/204 (99%) 3_HUMAN Junctional Score = 23.1 bits(47), Expect = 53 Gaps = 0/204(0%) adhesion molecule c Id = 9/15 (60%),Positives = 12/15 (80%), Strand = Plus/Plus precursor (JAM-C) Gaps= 1/15 (6%) gi|38570359|gb|AAR24620.1| Query 1 Migration-inducing gene13 VKIMTLKSRQRSYKN 15 Deoxyhypusine synthase VK++TLK R+ S KN Sbjct 39VKLLTLKPRETS-KN 52 Score = 23.1 bits (47), Expect = 53 Id = 7/9 (77%),Positives = 8/9 (88%), Gaps = 0/9 (0%) Query 6 LKSRQRSYK 14 L+SRQR YKSbjct 260 LQSRQRDYK 268 45 4_G8 gi|56788032|gb|AY69246 RPARSRRM 14gi|57163116|emb|CAI39857.1 Score = 26.5 bits (55), Expect = 5.14.1| Growth-ihibiting MAWGKA Protein (peptidyl-prolyl Id = 7/8 (87%),Positives = 7/8 (87%), gene 46 mRNA cis/trans isomerase) NIMA- Gaps= 0/8 (0%) Length = 1715 interacting, 4 (parvulin) Query 1 Score = 823bits (415) gi|32171215|ref|NP_859066.1 RPARSRRM 8 Expect = 0.0Transducer of regulated RP RSRPM Id = 415/415 (100%) cAMP responseelement Sbjct 14 Gaps = 0/415 (0%) binding protein RPERSRRM 21 Strand= Plus/Minus gi|37574639|gb|AAQ93056.1| Score = 23.5 bits (48), Expect= 40 Antigen MLAA-10 Id = 6/7 (85%), Positives = 6/7 (85%),gi|59800455|sp|Q9UPY6|WA Gaps = 0/7 (0%) SF3_HUMAN Query 6Wiskott-Aldrich syndrome RRMMAWG 12 protein family member 3 RR MAWG(WASP-family protein Sbjct 144 member 3) RRTMAWG 150gi|17368482|sp|Q15773|MLF Score = 23.1 bits (47), Expect = 53 2_HUMAN Id= 8/12 (66%), Positives = 9/12 (75%), Myeloid leukemia factor 2 Gaps= 2/12 (16%) gi|10720185|sp|Q9Y4L1|OXR Query 2 P_HUMAN PARSR-RMM-AW 11150 kDa oxygen-regulated PA+R  RMM AW protein 
precursor (Orp150) Sbjct46 (Hypoxia up-regulated 1) PAQPRHRMMSAW 57 47 5_H8gi|33342277|ref|NM_0252 DWRVPR 36 gi|3647275|emb|CAA12110.1 Score = 30.8bits (65), Expect = 0.31 04.2| PAPHHR Matrilin-3 Id = 11/17 (64%),Positives = 11/17 (64%), Hypothetical protein LGARRLgi|3252872|gb|AAC24200.1| Gaps = 5/17 (29%) PP2447, mRNA PNLHAABRCA1-associated protein 2 Query 5 Score = 232 bits (117) PGRAAPgi|60390212|sp|Q969F8|KISS PRPAPHHRLGARRLPNL 21 Expect = 3e−58, RAASGLR_HUMAN PRPAP     ARRLP L Id = 117/117 (100%) KiSS-1 receptor (KiSS-1R)Sbjct 2 Strand = Plus/Plus (Metastin receptor) PRPAP-----ARRLPGL 13gi|21265064|ref|NP_620688.1 Score = 30.3 bits (64), Expect = 0.41 ADAMmetallopeptidase Id = 18/40 (45%), Positives = 18/40 (45%), withthrombospondin type 1 Gaps = 16/40 (40%) motif, 17 preproprotein 2gi|51094804|gb|EAL24050.1| WRVPRPAPHHRLGARRLPNLHAA-----PGRAAPRAASGL 36cAMP responsive element W  PRP        R LP L  A     PGRAA R AS L bindingprotein 3-like 2 566 Basic transcriptional factorW--PRP--------RALP-LRGAVGSCPPGRAAARGASDL 594 2 Score = 29.9 bits (63),Expect = 0.59 Id = 19/41 (46%), Positives = 19/41 (46%), Gaps = 17/41(41%) 2 WRVPRPAPHHRLGARRLPNLHAAPG-R---------AAPRA 32 WRV RP   H      LPL AAPG R         AAPRA 41 WRV-RPDDVH------LPPLPAAPGPRRRRRPRTPPAAPRA 7448 8_G1 Homo sapiens NSSCSE 6 gi|119629922|gb|EAX09517.1 Score = 19.3bits (38), Expect = 638 1 prothymosin, alpha PBX|knotted 1 homeobox 1,Id = 5/6 (83%), Positives = 6/6 (100%), (gene sequence 28), isoformCRA_a Gaps = 0/6 (0%) mRNA (cDNA gi|91992160|ref|NP_055196. 
Query 1clone MGC: 45641 mutL homolog 3 isoform 2- NSSCSE 6 IMAGE: 4335961),gi|6689928|gb|AAF23904.1|A +SSCSE complete cds F195657_1 DNA mismatchSbjct 323 Length = 1213 repair protein DSSCSE 328 Score = 97.6 bits(49), gi|119623371|gb|EAX02966.1 Score = 19.3 bits (38), Expect = 638Expect = 6e−18 scavenger receptor class F, Id = 5/6 (83%), Positives= 6/6 (100%), Identities = 49/49 (100%) member 2, isoform CRA b Gaps= 0/6 (0%) Gaps = 0/49 (0%) gi|6648106|sp|Q05086|UBE3 Query 1 Strand= Plus/Minus A_HUMAN Ubiquitin- NSSCSE 6 protein ligase E3A NS+CSE(Renal carcinoma antigen Sbjct 83 NY-REN-54) NSTCSE 88gi|11385658|gb|AAG34910.1| Score = 19.3 bits (38), Expect = 638AF273050_1 Id = 5/6 (83%), Positives = 6/6 (100%), CTCL tumor antigense37-2 Gaps = 0/6 (0%) gi|1872514|gb|AAB49301.1| Query 1 E6-associatedprotein E6- NSSCSE 6 AP|ubiguitin-protein ligase N+SCSE Sbjct 102 NNSCSE107 49 1_D2 gi|20357564|ref|NM_0001 RGDKLLHQ 27 gi|28892|emb|CAA35582.1|Score = 28.6 bits (60), Expect = 1.6 00.2| Homo sapiens GARRRRGL unnamedprotein product Id = 12/22 (54%), Positives = 14/22 (63%), cystatin B(stefin B) RTPASVPIS gi|2135396|pir||S71548 Gaps = 3/22 (13%) (CSTB),mRNA PS homeotic protein pG2- Query 6 Length = 674 humanLHQGARRRRGLRTPASVPISPS 27 Score = 232 bits (117)gi|38648796|gb|AAH63316.1| L QG RRR  L   +SVP+ PS Expect = 9e−59 FBXL17protein Sbjct 228 Id = 117/117 (100%) LQQGGRRRGDL---SSVPTAPS 246 Gaps= 0/117 (0%) Score = 28.2 bits (59), Expect = 2.1 Strand = Plus/Plus Id= 11/18 (61%), Positives = 11/18 (61%), Gaps = 5/18 (27%) Query 11RRRRGL-----RTPASVP 23 RRRR L     RTPA VP Sbjct 25 RRRRPLLRLPRRTPAKVP 4252 11_B gi|110624286|dbj|AK2258 RCSSXNRX 35 gi|3184264|gb|AAC18917.1|Score = 30.8 bits (65), Expect = 0.42 8 50.1| Homo sapiens GREGCPRSF02569_2 ID = 11/16 (68%), Positives = 13/16 (81%), mRNA for E1B-55 kDa-CGLRNESQ gi|113426694|ret|XP_001130 Gaps = 3/16 (18%) associated protein5 LHVARCWG 029.1| PREDICTED: Query 8 isoform a variant, clone: LPGhypothetical protein QG--REGCPRSCGLRNE 22 
FCC125H07gi|34533723|dbj|BAC86786.1| QG  REGCP  CGLR++ Length = 3447 unnamedprotein product Sbjct 179 Score = 307 bits (155)gi|119568019|gb|EAW47634. QGAREGCP---CGLRHQ 192 Expect = 2e−811| fibronectin type III Score = 28.2 bits (59), Expect = 2.5 Id= 158/159 (99%) domain containing 1 Identities = 14/22 (63%), Positives= 15/22 (68%) Gaps = 0/159 (0%) Gaps = 5/22 (22%) Strand = Plus/PlusQuery 7 RQG-REGCPRSCGLRNESQLHV 27 R G REGCPR C  R +S LHV Sbjct 33RPGLPREGCPR-C--R-QSVLHV 50 Score = 27.8 bits (58), Expect = 3.3 Id= 9/12 (75%), Positives = 10/12 (83%), Gaps = 1/12 (8%) Query 7RQGREG-CPRSC 17 RQGREG C R+C Sbjct 1234 RQGREGACHRAC 1245 55 2_G7gi|48734768|gb|BC07192 ERPSKRYL 24 gi|51476730|emb|CAH18336. Score= 25.2 bits (52), Expect = 12 8.1| Homo sapiens HQAAGGGE 1| hypotheticalprotein Id = 12/23 (52%), Positives = 13/23 (56%), ribosomal proteinS17, RKERQLCS gi|30354570|gb|AAH51765.1| Gaps = 9/23 (39%) mRNA (cDNAclone Nuclear factor kappa-B, Query 7 MGC: 88613 subunit 1YLHQAAGGGER------KE--RQ 21 IMAGE: 5090053), gi|23271363|gcb|AAH35512.1|YL QA GGG+R      KE  RQ complete cds WD repeat domain 49 Sbjct 177Length = 488 gi|31621301|ref|NP_853550.1 YL-QAEGGGDRQLGDREKELTRQ 198Score = 476 bits(240) regulator of G-protein Score = 25.2 bits (52),Expect = 12 Expect = 7e−132 signalling like 1 Id = 12/23 (52%),Positives = 13/23 (56%), Id = 268/270 (99%) gi|40254949|ref|NP_065960.2Gaps = 9/23 (39%) Gaps = 1/270(0%) erythrocyte membrane Query 7 Strand= Plus/Plus protein band 4.1 like 5 YLHQAAGGGER------KE--RQ 21gi|14917040|sp|Q15127|SCA YL QA GGG+R      KE  RQ M2_HUMAN Sbjct 177Secretory carrier-associated YL-QAEGGGDRQLGDREKELTRQ 198 membraneprotein 2 Score = 24.0 bits (49), Expect = 29 gi|57162523|emb|CA140407.1Id = 8/16 (50%), Positives = 9/16 (56%), transcription factor 7-like 2Gaps = 5/16 (31%) (T-cell specific, HMG-box) Query 8 LHQAAGGGERKERQLC 23LH      ERK +QLC Sbjct 675 LHH-----ERKAKQLC 685 58 4_G7gi|71559137|ref|NM_0010 SSQGHF 6 gi|21753429|dbj|BACO4343.1| Score= 21.8 bits 
(44), Expect = 78 29.3| unnamed protein product Id = 6/6(100%), Positives = 6/6 (100%), Homo sapiens ribosomal Roquin (RINGfinger and Gaps = 0/6 (0%) protein S26 (RPS26), C3H zinc fingerprotein 1) Query 1 mRNA gi|55960182|emb|CAI17337.1 SSQGHF 6 Length = 699G protein-coupled receptor SSQGHF Score = 367 bits (185) 123 Sbjct 74Expect = 1e−98 gi|22749391|ref|NP_689910.1 SSQGHF 79 Id = 202/210 (96%)solute carrier family 44, Score = 19.3 bits (38), Expect = 619 Gaps= 0/210 (0%) member 5 Id = 5/5 (100%), Positives = 5/5 (100%), Strand= Plus/Plus gi|55960275|emb|CAI14751.1 Gaps = 0/5 (0%) novel MAM domainQuery 2 containing protein SQGHF gi|73918937|sp|Q8NCS7|CTL Sbjct 745_HUMAN Choline SQGHF 78 transporter-like protein Score = 18.0 bits(35), Expect = 1097 Ribosomal protein S6 Id = 5/5 (100%), Positives= 5/5 (100%), kinase 2 (S6K2) Gaps = 0/5 (0%) (Serine/threonine-proteinQuery 1 kinase 14 beta) SSQGH 5 gi|55664933|emb|CAH71148. SSQGH1| colony stimulating factor Sbjct 189 1 (macrophage) SSQGH 193 59 1_H2gi|39992414|gb|BC06441 GSGKIKKSV 20 gi|19923432|ref|NP_054876.2 Score= 28.2 bits (59), Expect = 2.1 8.1| Homo sapiens LWDRKVGI coiled-coildomain Id = 8/12 (66%), Positives = 11/12 (91%), FK506 binding proteinRKN containing 113 Gaps = 1/12 (8%) 9,63 kDa, mRNAgi|12053083|emb|CAB66719. Query 7 (cDNA clone 1| hypothetical proteinKSV-LNDRKVEI 17 IMAGE: 5750487), gi|115298682|ref|NP_055987.KS+ +W+RKVEI partial cds 2| HBxAg transactivated Sbjct 336 Length = 2421protein 2 KSIRMWERKVEI 347 Score = 486 bits (245)gi|119594637|gb|EAW74231. Score = 24.8 bits (51), Expect = 23 Expect= 1e−134 1| phospholipase C, beta 3 Id = 10/15 (66%), Positives = 10/15(66%), Id = 259/264 (98%) (phosphatidylinositol- Gaps = 0/15 (0%) Gaps= 0/264 (0%) specific) Query 2 Strand = Plus/Minusgi|119590306|gb|EAW69900. 
SGKIKKSVLWDRKVE 16 1| nucleoporin 133 kDa SG IKK VL D K E Sbjct 1013 SGPIKKPVLRDMKEE 1027 Score = 22.7 bits (46), Expect = 98 Id = 7/8 (87%), Positives = 7/8 (87%) Gaps = 0/8 (0%) Query 10 LWDRKVGI 17 L DRKVGI Sbjct 671 LSDRKVGI 678 60 9_G7 gi|119380763|gb|EF17744 ARRWSRST 15 gi|10732604|gb|AAG22468.1| Score = 24.8 bits (51), Expect = 23 7.1| LCRSICL AF193040_1 unknown Id = 7/8 (87%), Positives = 7/8 (87%), Homo sapiens isolate gi|119584289|gb|EAW63885. Gaps = 0/8 (0%) TA23 mitochondrion, 1| tRNA nucleotidyl Query 2 complete genome transferase, CCA-adding, 1, RRWSRSTL 9 Length = 16569 isoform CRA_a RRWS STL Score = 424 bits (214) gi|68566213|sp|Q8NFZ6|VN1 Sbjct 187 Expect = 2e−116 R2_HUMAN RRWSPSTL 194 Id = 214/214 (100%) Vomeronasal type-1 Score = 24.0 bits (49), Expect = 41 Gaps = 0/214 (0%) receptor 2 Id = 7/9 (77%), Positives = 7/9 (77%), Strand = Plus/Plus (V1r-like receptor 2) Gaps = 2/9 (22%) (hGPCR25) Score = 22.3 bits (45), Expect = 133 Putative G-protein coupled Id = 8/10 (80%), Positives = 8/10 (80%), receptor Gaps = 1/10 (10%) Query 5 SRSTLCRSIC 14 SRS LC SIC Sbjct 373 SRSRLC-SIC 381 61 10_H gi|37704380|ref|NM_0040 LIQHQHLG 10 Zinc finger protein 566 Score = 23.1 bits (47), Expect = 74 8 48.2| QI gi|32470620|sp|Q9Y6K5|QAS Id = 6/7 (85%), Positives = 7/7 (100%),
Homo sapiens beta-2- 3 HUMAN Gaps = 0/7(0%) microglobulin (B2M), 2′-5′-oligoadenylate Query 1 mRNA synthetase 3(p100 OAS) LIQHQHL 7 Length = 987 gi|73620903|sp|Q5SZK8|FRE LIQHQ+LScore = 246 bits (124) M2_HUMAN Sbjct 410 Expect = 2e−62 FRAS1-relatedextracellular LIQHQNL 416 Id = 124/124 (100%) matrix protein 2 precursorScore = 22.7 bits (46), Expect = 99 Gaps = 0/124 (0%)gi|21263627|sp|Q9NQZ8|ZNF Id = 7/8 (87%), Positives 7/8 (87%), Strand= Plus/Plus 71_HUMAN Gaps = 1/8 (12%) Endothelial zinc finger Query 1protein induced by tumor LIQ-HQHL 7 necrosis factor alpha LIQ HQHL (Zincfinger protein 71) Sbjct 258 Probable ATP-dependent LIQQHQHL 265helicase DDX41 Score = 21.4 bits (43), Expect = 174 DEAD-box protein 41Id = 7/9 (77%), Positives = 8/9 (88%) Gaps = 0/9 (0%) Query 1 LIQHQHLGQ9 LIQ+ HLGQ Sbjct 1379 LIQYVHLGQ 1367 64 2_D6 Homo sapiens actin, RPHCEL23 gi|119533|sp|P04626|ERBB2_ Score = 25.2 bits (52) Expect = 15 gamma1, mRNA (cDNA WGMLAP HUMAN Identities = 6/6 (100%), clone IMAGE:3461395) TDCCHL Receptor tyrosine-protein Positives = 6/6 (100%) Score= 172 bits (87) HRSSF kinase erbB-2 precursor Query 14 Expect = 2e−40(p185erbB2)(C-erbB-2) PTDCCH 19 Identities = 87/87 (100%) (NEUproto-oncogene) PTDCCH gi|20139132|sp|Q9BZF2|OSR Sbjct: 232 7_HUMANPTDCCH 237 Oxysterol binding protein- Score = 15.9 bits (30) Expect= 9714, related protein 7 Identities = 5/6 (63%),gi|13633990|sp|Q9NQE7|TSS Positives = 5/6 (63%) P_HUMAN Thymus-specificQuery: 19 serine protease precursor HLHRSS 24 gi|29428029|sp|Q9NW08|RP HHRSS C2_HUMAN DNA-directed Sbjct: 1045 RNA polymnerase III HRHRSS 1050subunit 127.6 kDa Score = 24.0 bits (49), Expect = 38 polypeptide Id= 8/12 (66%), Pos = 8/12 (66%), gi|2811086|sp|P00533|EGFR_ Gaps = 0/12(0%) HUMAN Query 10 Epidermal growth factor LAPTDCCHLHRS 21 receptorprecursor L PTD  HL RS (Receptor tyrosine-protein Sbjct 291 kinaseErbB-1) LPPTDYAHLQRS 302 Score = 23.5 bits (48), Expect = 51 Id = 6/7(85%), Positives = 7/7 (100%), Gaps = 0/7 (0%) Query 6 LWGMLAP 
12LWG+LAP Sbjct 17 LWGLLAP 23 67 8_G3 gi|71373270|gb|D013740 PHNTFSAYP 30gi|24419041|gb|AAL65133.2| Score = 27.4 bits (57), Expect = 2.7 8.1|ECPDVTRT ovarian cancer related Id = 12/24 (50%), Positives = 15/24(62%), Homo sapiens isolate TPMHTPHE tumor marker CA 125 Gaps = 6/24(25%) UV122 mitochondrion, TSYHL gi168566146.sp.Q9NQW7|XP Query 2complete genome P1_HUMAN HNTFSAYPE-----CPDVTRTTPM 20 Length = 16568Xaa-Pro aminopeptidase 1 H+T SAYPE      P+VT T+ M Score = 496 bits (250)(Soluble aminopeptidase P) Sbjct 4847 Expect 2e−137gi|32699681|sp|Q9H4B6|SAV HSTVSAYPEPSKVTSPNVT-TSTM 4869 Id = 250/250(100%) 1_HUMAN Score = 22.3 bits (45), Expect = 91 Gaps = 0/250 (0%)Salvador hoinolog 1 protein Id 10/18 (55%), Positives = 11/18 (61%)Strand = Plus/Plus (45 kDa WW domain Gaps = 5/18 (27%) /gene = “COX1”protein) (hWW45) Query 13 /product = “cytochrome cgi|2230414|sp|O15067|PUR4 DVTRTTPMHTPH--ETSY 28 oxidase subunit 1”_HUMAN DVT T+P   P   ETSY Phosphoribosylformylglycin Sbjct 4440 amidinesynthase (FGAM DVTWTSP---PSVAETSY 4454 synthase) Score = 20.6 bits (41),Expect = 296 gi|39793978|gb|AAH63538.1| Id = 9/16 (56%), Positives= 10/16 (62%), PFAS protein Gaps = 2/16 (12%) gi|8134719|sp|Q92966|SNPCQuery 7 3_HUMAN AYPECPDVTRTTPMHT 22 snRNA activating protein AY E PV  T+PM T complex 50 kDa subunit Sbjct 6614 (Proximal sequenceAYSEPPRV--TSPMVT 6627 element-binding Score = 26.9 bits (56), Expect= 4.9 transcription factor beta Id = 11/18 (61%), Positives = 12/18(66%), subunit) Gaps = 6/18 (33%) (PSE-binding factor beta Query 13subunit) (PTF beta subunit) DVTRTTPMH--TPHETSY 28gi|55960781|emb|CAI16349.1 DVTRT  MH  TP  T+Y hepatoma-derived growthSbjct 426 factor (high-mobility group DVTRT--MHFGTP--TAY 439 protein1-like) Score = 26.1 bits (54), Expect = 11 Id = 16/36 (44%), Positives= 19/36 (52%), Gaps = 11/36 (30%) Query 3PNSSPHNTFSAYPECPDV-TRT-----TPMH-TPHE 31 P+SSP N FS      DV+R      TP+  TPHE Sbjct 53 PDSSP-NAFST---SGDVVSRNQSFLRTPIQRTPHE 84 Score= 26.1 bits (54), Expect = 11 Id = 10/15 
(66%), Positives = 11/15 (73%),Gaps = 1/15 (6%) Query 12 SAYPECPDVTRT-TP 25 SAY  CPD+T T TP Sbjct 826SAYAVCPDITATVTP 840 Score = 17 .6 bits (34), Expect = 2315 Id = 9/21(42%), Positives = 10/21 (47%), Gaps = 8/21 (38%) Query 10ECPDVTRT-------TPMHTP 23 ECP V R        TP+ TP Sbjct 15ECP-VRRNGQGDAPPTPLPTP 34 68 1_H7 gi|30584378|gb|BT00777 GRVIQE 36gi|28892|emb|CAA35582.1| Score = 28.6 bits (60), Expect = 1.40.1| Synthetic construct PGGRGD unnamed protein product Identities= 12/22 (54%), Positives = 14/22 (63%) Homo sapiens cystatin B KLLHQGgi|2135396|pir||S71548 Gaps = 3/22 (13%) (stefin B) mRNA ARRRRG homeoticprotein pG2- Query 15 Score = 280 bit(141) LRTPAS humanLHQGARRRRGLRTPASVPISPS 36 Expect = 1e−72 VPISPSgi|48428276|sp|O43432|IF4G L QG RRR L    +SVP +PS Id = 141/141 (100%)3_HUMAN Sbjct 227 Strand = Plus/Plus Eukaryotic translationLQQGGRRRGDL---SSVPTAPS 245 initiation factor 4 gamma 3 Score = 28.6 bits(60), Expect = 1.3 (elF-4-gamma 3) Identities = 12/22 (54%), Positives= 14/22 (63%), (eIF-4G 3) (eIF4G 3) (eIF-4- Gaps = 3/22 (13%) gamma II)(eIF4GII) Query: 15 gi|2190402|emb|CAA73944.1| LHQGARRRRGLRTPASVPISPS 36latent TCF-beta binding L QG RRR  L   +SVP +PS protein-4 Sbjct: 228LQQGGRRRGDL---SSVPTAPS 246 Score = 26.9 bits (56), Expect = 4.3Identities = 15/28 (53%), Positives = 17/28 (60%) Gaps = 7/28 (25%)Query: 1 GRVIQEPGGRGDKLLHQGARR-------RR 23 GR  Q  PGGRGLL+ G+RR       RR Sbjct: 685 GR----QTPGGRGVPLLNVGSRRSQPGQRR 710 69 2_C8gi|399924141|gb|BC06441 GSGKIKKSV 20 gi|56202484|emb|CA121939.1 Score= 24.0 bits (49), Expect = 29 8.1| Homo sapiens LWDRKVGI HBxAgtransactivated Id = 9/13 (69%), Positives = 9/13 (69%), FK506 bindingprotein RKN protein 2 (XTP2) Gaps = 0/13 (0%) 9,63 kDa, mRNAgi|119611307|gb|EAW90901. 
Query 2 (cDNA clone IMAGE: 575 1| BAT2 domain containing SGKIKKSVLWDRK 14 0487), partial cds 1, isoform CRA_d SG IKK VL D K Length = 2421 gi|29427863|sp|Q8WUM0|N Sbjct 83 Score = 486 bits (245) U133_HUMAN SGPIKKPVLRDMK 95 Expect = 1e−134 Nuclear pore complex Score = 24.0 bits (49), Expect = 41 Id = 259/264 (98%) protein Nup133 Id = 6/6 (100%), Positives = 6/6 (100%), Gaps = 0/264 (0%) gi|61216828|sp|Q96HF1|SFR Gaps = 0/6 (0%) Strand = Plus/Minus P2_HUMAN Query 8 Secreted apoptosis-related SVLWDR 13 protein 1 SVLWDR gi|12803735|gb|AAH02704.1 Sbjct 101 Signal transducer and SVLWDR 106 activator of transcription 1 Score = 23.1 bits (47), Expect = 73 gi|29337296|ref|NP_803881.1 Id = 6/6 (100%), Positives = 6/6 (100%), Melanoma antigen family D, Gaps = 0/6 (0%) 4 isoform 2 Query 6 gi|57012811|sp|Q7Z419|IBR KKSVLW 11 D2_HUMAN KKSVLW E3 ubiquitin ligase IBRDC2 Sbjct 233 (p53-inducible RING finger KKSVLW 238 protein) Phospholipase C beta 3 70 12_D gi|24430161|ref|NM_0003 TQLV 4 gi|38569484|ref|NP_060111.2 Score = 15.5 bits (29), Expect = 4278 04.2| Homo sapiens kinesin family member 21A Identities = 4/4 (100%), Positives = 4/4 (100%) peripheral myelin gi|21265037|ref|NP_055058.1 Gaps = 0/4 (0%) protein 22 (PMP22), ADAM metallopeptidase Query 1 transcript with thrombospondin type 1 TQLV 4 variant 1, mRNA motif, 3 proprotein TQLV Length = 1828 gi|57209716|emb|CA140838.1 Sbjct 78 Score = 357 bits (180), cadherin related 23 TQLV 81 Expect = 3e−96 gi|56204273|emb|CAI19274.1 Id = 208/219 (94%) retinoblastoma-associated Gaps = 1/219 (0%) factor 600 (RBAF600) Strand = Plus/Minus gi|55962143|emb|CAI15704.1 calsyntenin 1 gi|42525235|gb|AAS18317.1| CDK5 regulatory subunit associated protein 1 transcript variant 3 73 2_H3 gi|125541850|gb|EF40865 GVSVNEAS 16 gi|58202610|gb|AAW67356.1 Score = 27.8 bits (58), Expect = 2.9 6.1| Homo sapiens YDGKYSSY random intestinal-homing Id = 9/14 (64%), Positives = 10/14 (71%), haplotype H7 antibody heavy chain Gaps = 0/14 (0%) mitochondrion, variable region Query 3 complete
genome gi|60326824|gb|AAX18924.1| WVNEASYDGKYSSY 16 Length = 16569 anti-TARP (novel breast WV   SYDGKY +Y Score = 1199 bits (605), and prostate tumor- Sbjct 49 Expect = 0.0 associated antigen) WVAVISYDGKYENY 62 Id = 665/677 (98%) immunoglobulin heavy Score = 25.2 bits (52), Expect = 17 Gaps = 0/677 (0%) chain Id = 8/11 (72%), Positives = 8/11 (72%), Strand = Plus/Minus gi|13161090|gb|AAK134791| Gaps = 0/11 (0%) /product = “NADH AF332227_1 heat shock Query 3 dehydrogenase subunit transcription factor 2-like WVNEASYDGKY 13 1” protein WV   SYDGKY gi|119602637|gb|EAW82231. Sbjct 47 1| eukaryotic translation WVAVISYDGKY 57 elongation factor 1 delta Score = 24.4 bits (50), Expect = 31 gi|10719919|sp|Q15326|ZMY Id = 7/8 (87%), Positives = 7/8 (87%), 11_HUMAN Gaps = 0/8 (0%) (Adenovirus 5 E1A-binding Query 2 protein) VSVNEASY 9 VSVNEA Y Sbjct 352 VSVNEAPY 359 75 1_G3 gi|71483115|ref|NM_0010 GAAQPR 28 gi|34526547|dbj|BAC85151.1| Score = 29.9 bits (63), Expect = 0.47 24.3| Homo sapiens NAERRR FLJ00336 protein Id = 10/13 (76%), Positives = 10/13 (76%) ribosomal protein S21 RVRGPV gi|37590807|gb|AAH59399.1| Gaps = 1/13 (7%) (RPS21), mRNA RAAEML Solute carrier family 27 Query 7 Length = 418 R (fatty acid transporter) QP-RNAERRRVR 18 Score = 383 bits (193) gi|3139079|gb|AAC36682.1| QP R AERR RVR Expect = 6e−14 cullin 3 Sbjct 347 Id = 202/206 (98%) gi|51095155|gb|EAL24398.1| QPVREAERRHRVR 359 Gaps = 0/206 (0%) hypothetical protein Score = 29.5 bits (62), Expect = 0.63 Strand = Plus/Plus FLJ36031 Id = 11/18 (61%), Positives = 13/18 (72%) /product = “ribosomal gi|7018298|emb|0AB75612.1| Gaps = 1/18 (5%) protein S21” c394H11.1 (similar to SOX Query 11 (SRY (sex determining AERRRAVRGPVRAA-EML 27 region Y))) AERA R RG +R A +ML gi|22261792|sp|O60312|AT10 Sbjct 173 A_HUMAN AERRSRSRGAIRNACQML 190 Probable phospholipid- Score = 28.6 bits (60), Expect = 1.1 transporting ATPase VA Id = 12/17 (70%), Positives = 12/17 (70%) gi|14009443|dbj|BAB47392.1| Gaps = 2/17 (11%) putative aminophospholipid Query 8 translocase
PRNAERRRRVRGPVRAA 24 gi|26986715|gb|AAN86723.1| PR ARRRR RG  RAA gi|54673563|gb|AAH37322.3| Sbjct 122 Transcription factor MLR1 PRAAARRRR-RG-ARAA 136 gi|1169723|sp|P41439|FOLR3_HUMAN Folate receptor gamma precursor (FR-gamma) gi|55957252|emb|CA112668. pancreas specific transcription factor, 1a gi|11493712|gb|AAG35617.1| homeobox transcription factor gi|57242761|ref|NP_003795.2 receptor (TNFRSF)-interacting serine-threonine kinase 1 gi|60393639|sp|Q3546|RIPK1_HUMAN Receptor-interacting serine/threonine-protein kinase 2 (Cell death protein RIP) 76 11_B gi|18088464|gb|BC02116 NSSHH 5 gi|119616592|gb|EAW96186. Score = 19.3 bits (38), Expect = 532 107.1| Homo sapiens 1| killer cell lectin-like Id = 5/5 (100%), Positives = 5/5 (100%), prostaglandin E receptor subfamily C, Gaps = 0/5 (0%) synthase 3 (cytosolic), member 3, isoform CRA_c Query 1 mRNA (cDNA clone gi|119618016|gb|EAW97610. NSSHH 5 IMAGE: 3683159), 1| apoptotic peptidase NSSHH partial cds activating factor Sbjct 137 Length = 1057 gi|119630109|gb|EAX09704.1 NSSHH 141 Score = 420 bits (212) dual-specificity tyrosine- Expect = 3e−115 (Y)-phosphorylation Id = 212/212 (100%) regulated kinase Gaps = 0/212 (0%) 1A, isoform CRA_c Strand = Plus/Plus gi|119580009|gb|EAW59605. 1| SWI/SNF related, matrix associated, actin dependent regulator of chromatin 77 5_C3 gi|98162806|ref|NM_0174 LGTSTAQQ 28 gi|119592169|gb|EAW71763.
Score= 27.4 bits (57), Expect = 3.8 32.3| Homo sapiens HGGWCPEA1| hCG1794614, isoform Id = 15/26 (57%), Positives = 16/26 (61%),prostate tumor SKDGPHPST gi|21750918|dbj|BAC03866.1| Gaps = 9/26 (34%)overexpressed gene 1 FPQ unnamed protein product Query 8 (PTOV1), mRNAgi|4877793|gb|AAD31434.1| QHGGWCPEASKD--GPHP---STFPQ 28 Length = 1884DNA methyltransferase 3 QHGG CP  +K   GPHP   ST PQ Score = 206 bits(104), beta 5 Sbjct 13 Expect = 5e−51 gi|62898391|dbj|BAD97135.1QHGG-CP--AKALPGPHPGVVST-PQ 34 Id = 104/104(100%), BCL2-like 14 isoform 1Score 27.4 bits (57), Expect = 3.8 Gaps = 0/104(0%)gi|56748611|sp|Q9BZR8|B2L Id = 7/7 (100%), Positives = 7/7 (100%),Strand = Plus/Plus 14_HUMAN Apoptosis Gaps = 0/7 (0%) facilitatorBcl-2-like 14 Query 11 protein GWCPEAS 17 gi|118572673|sp|O14513|NAGWCPEAS P5_HUMAN Sbjct 131 Nck-associated protein 5 GWCPEAS 137(Peripheral clock protein) Score = 25.2 bits (52), Expect = 16gi|13633942|sp|Q9UM82|SPA Id = 8/10 (80%), Positives = 8/10 (80%),T2_HUMAN Gaps = 1/10 (10%) Spermatogenesis-associated Query 8 protein 2QHGGWW-PEA 16 QHG WC PEA Sbjct 145 QHGPWCPPEA 154 79 11_Cgi|18088638|gb|BC02088 DCSC 4 gi|119615430|gb|EAW95024. Score = 17.6bits (34), Expect = 1378 9 9.1| Homo sapiens 1| EPH receptor B2 Id = 4/4(100%), Positives = 4/4 (100%) cDNA clone gi|119599812|gb|EAW79406. Gaps= 0/4 (0%) IMAGE: 4692359 1| integrin, beta 5 Query 1 Length = 1436gi|119583687|gb|EAW63283. 
DCSC 4 Score = 466 bits (235), 1| ADAM metallopeptidase DCSC Expect = 4e−129 domain 9 (meltrin gamma) Sbjct 216 Id = 247/253 (97%), gi|115502238|sp|Q9Y2K2|QS DCSC 219 Gaps = 0/253 (0%) K_HUMAN Serine/threonine-protein kinase QSK gi|119603107|gb|EAW82701.1| phosphorylase kinase, beta, isoform CRA_b 80 5_H4 gi|34304330|ref|NM_1830 NRSRWICG 17 gi|13124092|sp|Q9UBT3|DK Score = 24.0 bits (49), Expect = 29 63.1| PLGLIKALV K4_HUMAN Dickkopf- Id = 8/13 (61%), Positives = 9/13 (69%) Homo sapiens ring related protein 4 precursor Gaps = 3/13 (23%) finger protein 7 (RNF7), (Dkk-4) (24) Query 5 transcript variant 2, gi|51477372|ref|XP_293360.4 WICGPLGLIKALV 17 mRNA PREDICTED: Similar to W+C PLG   ALV Length = 1880 Ubiquitin ligase SIAH1 Sbjct 11 Score = 708 bits (357) Seven in absentia homolog WLCSPLG---ALV 20 Expect = 0.0 1-like Score = 22.3 bits (45), Expect = 95 Id = 365/369 (98%) gi|48146981|emb|CAG3 Id = 7/9 (77%), Positives = 8/9 (88%) Gaps = 0/369 (0%) 3713.1| Gaps = 0/9 (0%) Strand = Plus/Plus NOV [Homo sapiens] Query 8 gi|1352515|sp|P48745|NOV_ GPLGLIKAL 16 HUMAN NOV protein GPLGLI+ L homolog precursor (NovH) Sbjct 72 (Nephroblastoma GPLGLIRNL 80 overexpressed gene protein Score = 22.3 bits (45), Expect = 129 homolog) Id = 5/5 (100%), Positives = 5/5 (100%), gi|55925626|ref|NP_0010072 Gaps = 0/5 (0%) 54.1| endogenous retroviral Query 5 sequence 3 WICGP 9 gi|44887883|sp|Q14264|ENR WICGP 1_HUMAN HERV- Sbjct 419 R_7q21.2 provirus ancestral WICGP 423 Env polyprotein precursor (Envelope polyprotein) 83 2_H7 gi|4506788|ref|NM_00297 RKGKKSKR 14 gi|8924242|ref|NP_061136.1| Score = 25.7 bits (53), Expect = 9.1 0.1| RKWLNS sarcoma antigen 1 Id = 10/20 (50%), Positives = 12/20 (60%), Homo sapiens gi|8216987|emb|CAB92443.1| Gaps = 7/20 (35%) spermidine/spermine putative tumor antigen Query 1 N1-acetyltransferase gi|51458801|ref|XP_089384.7 RKGKK---SKRRK----WLN 13 (SAT), mRNA | PREDICTED: similar to RK K+   SKRRK    WL+ Length = 1060 RIKEN cDNA A430025D11 Sbjct 51 Score = 1031 bits (520)
gi|23396625|sp|Q9NQT8|K11RKSKRHSSSKRRKSMSSWLD 70 Expect = 0.0 3B_HUMAN Kinesin-like Score = 25.2bits (52), Expect = 12 Id = 520/520 (100%) protein K1F13B Id = 7/9(77%), Positives = 8/9 (88%), Gaps = 0/520 (0%)gi|57013078|sp|Q7Z699|SPR Gaps = 0/9 (0%) Strand = Plus/Plus E1_HUMANSprouty- Query 4 related, EVH1 domain KKSKRRKWL 12 containing protein 1KK K+RKWL gi|31565770|gb|AAH53600.1| Sbjct 22 Transmembrane and coiled-KKKKKRKWL 30 coil domains 4 Score = 24.4 bits (50), Expect = 30gi|56202682|emb|CA120092.1 Id = 7/7 (100%), Positives = 7/7 (100%),novel protein Gaps = 0/7 (0%) gi|7227890|sp|O24175|FL_O Query 4 RYSAPutative KKSKRRK 10 transcription factor FL KKSKRRKgi|51701343|sp|Q8TD10|CHD Sbjct 60 5_HUMAN Chromodomain KKSKRRK 66helicase-DNA-binding Score = 24.8 bits (51), Expect = 22 protein5(CHD-5) Id = 6/6 (100%), Positives = 6/6 (100%)gi|126178|sp|P14151|LYAM1 Gaps = 0/6 (0%) _HUMAN L-selcctin Query 8precursor (Lymph node RRKWLN 13 homing receptor) RRKWLN (Leukocyteadhesion Sbjct 1092 molecule 1) (gp90-MEL) RRKWLN 1097gi|18378735|ref|NP_055625.2 Score = 24.4 bits (50), Expect = 22centrosome-associated Id = 8/11 (72%), Positives = 8/11 (72%), protein350 Gaps = 1/11 (9%) Query 1 RKGKKSKRRKW 11 RK KK  RRKW Sbjct 178RKKKENRRKW 187 85 9_C3 gi|92444107|gb|BC01388 GDPNSS 6gi|119615660|gb|EAW95254. Score 18.5 bits (36), Expect = 1148 7.1| Homosapiens 1| cytoplasmic linker Id = 5/5 (100%), Positives = 5/5 (100%),mRNA similar to mouse associated Gaps = 0/5 (0%) double minute 2, humangi|119607399|gb|EAW86993. Query 1 homolog of; p53-binding 1| telomericrepeat binding GDPNS 5 protein (cDNA clone factor (NIMA-interacting) 1GDPNS IMAGE: 3841679) gi|119588279|gb|EAW67873. 
Sbjct 177 Length = 3865 1| protein tyrosine GDPNS 181 Score = 835 bits (421) phosphatase, receptor type J Score = 18.5 bits (36), Expect = 1148 Expect = 0.0 gi|119626399|gb|EAX05994.1 Id = 5/5 (100%), Positives = 5/5 (100%), Id = 423/424 (99%) dentin sialophosphoprotein Gaps = 0/5 (0%) Gaps = 0/424 (0%) gi|119613825|gb|EAW93419. Query 2 Strand = Plus/Plus 1| TNF receptor-associated DPNSS 6 factor 5 DPNSS gi|116242829|sp|Q9Y4A5|TR Sbjct 2031 RAP_HUMAN Transformation/ DPNSS 2035 transcription domain-associated protein 87 2_C2 gi|30584378|gb|BT00777 GRVIQE 36 gi|28892|emb|CAA35582.1| Score = 28.6 bits (60), Expect = 1.4 0.1| Synthetic construct PGGRGD unnamed protein product Identities = 12/22 (54%), Positives = 14/22 (63%), Homo sapiens cystatin B KLLHQG gi|2135396|pir||S71548 Gaps = 3/22 (13%) (stefin B) mRNA ARRRRG homeotic protein pG2 Query 15 Score = 280 bits (141) LRTPAS gi|48428276|sp|O43432|IF4G LHQGARRRRGLRTPASVPISPS 36 Expect = 1e−72 VPISPS 3_HUMAN L QG RRR L   +SVP +PS Id = 141/141 (100%) Eukaryotic translation Sbjct 227 Strand = Plus/Plus initiation factor 4 gamma 3 LQQGGRRRGDL---SSVPTAPS 245 (eIF-4-gamma II) (eIF4GII) Score = 28.6 bits (60), Expect = 1.3 gi|2190402|emb|CAA73944.1| Identities = 12/22 (54%), Positives = 14/22 (63%), latent TGF-beta binding Gaps = 3/22 (13%) protein-4 Query: 15 LHQGARRRRGLRTPASVPISPS 36 L QG RRR  L   +SVP +PS Sbjct: 228 LQQGGRRRGDL---SSVPTAPS 246 Score = 26.9 bits (56), Expect = 4.3 Identities = 15/28 (53%), Positives = 17/28 (60%), Gaps = 7/28 (25%) Query: 1 GRVIQEPGGRGDKLLHQGARR-----RR 23 GR  Q PGGRG  LL+ G+RR     RR Sbjct: 685 GR--QTPGGRGVPLLNVGSRRSQPGQRR 710 90 10_F gi|125656973|gb|EF06115 VGNGEG 11 gi|20177960|sp|Q96JB6|LOX Score = 24.8 bits (51), Expect = 21 11 0.1| RLEVL L4_HUMAN Id = 7/7 (100%), Positives = 7/7 (100%), Homo sapiens isolate Lysyl oxidase homolog 4 Gaps = 0/7 (0%) UV0758 mitochondrion, gi|4826914|ref|NP_005081.1| Query 11 complete genome phospholipase A2, group IVB EGRLEVL 17 Length = 16573 gi|3811347|gb|AAC78836.1| EGRLEVL Score = 208
bits (105) cytosolicphospholipase A2 Sbjct 43 Expect = 3e−51 gi|2745961|gb|AAB94793.1|EGRLEVL 49 Id = 112/115 (97%) Bcd orf2 Score = 24.4 bits (50), Expect= 22 Gaps = 0/115 (0%) gi|5453978|ref|NP_006250.1| Id = 7/7 (100%),Positives = 7/7 (100%), Strand = Plus/Minus protein kinase, cGMP- Gaps= 0/7 (0%) /product = “NADH dependent type II Query 4 dehydrogenasesubunit gi|51473081|ref|XP_372641.2 GEGRLEV 10 1” Spermatogenesisassociated GEGRLEV protein PD1 Sbjct 349 gi157282312|emb|CAD43180.GEGRLEV 355 3| interleukin-1 receptor Score = 22.3 bits (45), Expect= 97 associated kinase-2 Id = 6/8 (75%), Positives = 8/8 (100%),gi|20177834|sp|O43866|CD5 Gaps = 0/8 (0%) L_HUMAN CD5 antigen- Query 4like precursor (SP-alpha) GEGRLEVL 11 (IgM-associated peptide) G+GRLE+LSbjct 152 GQGRLEIL 159 93 2_G2 gi|6706619|emb|AJ25197 GSRVRMSG 14gi|11322769|emb|CAC16957. Score = 26.5 bits (55), Expect = 7.13.1|HSA251973 KKKERK gap junction protein, alpha Id = 8/10 (80%),Positives = 8/10 (80%), Homo sapiens partial 3, 46 kDa (connexin 46)Gaps = 0/10 (0%) steerin-1 gene gi|119607339|gb|EAW86933. Query 4 Length= 200033 1| centrosome and spindle VRMSGKKKER 13 Features in this partof pole associated protein 1, VRM  KKKER subject sequence:gi|119578900|gb|EAW58496. 
Sbjct 100 Steerin-1 protein nucleolar protein)family VRMEEKKKER 109 Score = 662 bits (334) 6 (RNA-associated), isoformScore = 24.4 bits (50), Expect = 31 Expect = 0.0gi|20141241|sp|P50454|SERP Id = 7/8 (87%), Positives = 8/8 (100%), Id= 370/380 (97%) H_HUMANSerpin H1 Gaps = 0/8 (0%) Gaps = 2/380 (0%)precursor (Collagen- Query 6 Strand = Plus/Minus binding protein)MSGKKKER 13 (Proliferation-inducing gene +SGKKKER 14 protein) sbjct 5gi|30583027|gb|AAP35758.1| LSGKKKER 12 serine (or cysteine) Score = 24.0bits (49), Expect = 41 proteinase inhibitor, clade Id = 7/10 (70%),Positives = 9/10 (90%), H (heat shock protein 47), Gaps = 0/10 (0%)member 2 Query 4 VRMSGKKKER 13 VR+S KKK+R Sbjct 95 VRLSEKKKDR 104 9410_F gi|21411332|gb|BC03101 NSSVS 5 gi|119626680|gb|EAX06275.1 Score= 17.2 bits (33), Expect = 2312 8 2.1| alpha-kinase 1, isoform Id = 5/5(100%), Positives = 5/5 (100%), Homo sapiens gi|119624697|gb|EAX04292.1Gaps 0/5 (0%) eukaryotic translation ectonucleotide Query 1 elongationfactor 1 pyrophosphatase/phospho- NSSVS 5 gamma, diesterase 4 NSSVS mRNA(cDNA clone gi|119597572|gb|EAW77166. Sbjct 3340 MGC: 32765 1| AT hookcontaining NSSVS 3344 IMAGE: 4654721), transcription factor 1 completecds gi|119583836|gb|EAW63432. Length = 1441 1| testis expressed sequenceScore = 965 bits (487) 15, isoform CRA_a Expect = 0.0gi|118572823|sp|Q96QP1|AL Id = 529/550 (96%) PK1_HUMAN Lymphocyte Gaps= 0/550 (0%) alpha-protein kinase Strand = Plus/Plusgi|91208166|sp|Q96Q15|5MG 1_HUMANSerine/threonine -protein kinase SMG195 10_H gi|33869450|gb|BC01719 SICA 4 gi|119621016|gb|EAX00611.1 Score= 15.9 bits (30), Expect = 4455 3 4.2| solute carrier family 30 Id = 4/4(100%), Positives = 4/4 (100%), Homo sapiens E74-like (zinctransporter),member 3 Gaps = 0/4 (0%) factor 4 (ets domaingi|119629012|gb|EAXO8607. Query 1 transcription factor), FRAS1 relatedextracellular SICA 4 mRNA (cDNA clone matrix protein 2 SICA MGC: 1755gci|119590121|gb|EAW69715. 
Sbjct 741 IMAGE: 3138355), 1| nuclear VCP-like SICA 744 complete cds gi|110282976|sp|P46020|KPB Length = 4121 1_HUMAN Score = 904 bits (456), Phosphorylase b kinase Expect = 0.0 regulatory subunit alpha, Id = 464/467 (99%) skeletal muscle isoform Gaps = 0/467 (0%) gi|62087778|dbj|BAD92336. Strand = Plus/Plus rearranged L-myc fusion sequence variant 96 11_C gi|109148511|ref|NR_003 QLRISTTRS 11 gi|119628939|gb|EAX08534.1 Score = 24.0 bits (49), Expect = 41 1 089.1| Homo sapiens WT hCG2042156 Id = 7/10 (70%), Positives = 8/10 (80%), OTU domain, ubiquitin gi|116496611|gb|AA126137.1| Gaps = 0/10 (0%) aldehyde binding 1 DNASE2B protein Query 1 (OTUB1), transcript gi|46395921|sp|Q8WZ79|DN QLRISTTRSW 10 variant 2, transcribed S2B_HUMAN QLR ST R+W RNA Deoxyribonuclease-2-beta Sbjct 88 Length = 2518 gi|17066106|emb|CAD12457. QLRDSTARAW 97 Score = 829 bits (418) 1| Novex-3 Titin Isoform Score = 23.1 bits (47), Expect = 75 Expect = 0.0 gi|119610438|gb|EAW90032. Id = 6/6 (100%), Positives = 6/6 (100%), Id = 425/428 (99%) 1| netrin 1 Gaps = 0/6 (0%) Gaps = 0/428 (0%) gi|119582311|gb|EAW61907.
Query 5 Strand = Plus/Plus 1| centaurin, delta3, STTRSW 10 gi|47779175|gb|AAT38470.1| STTRSW immunoglobulin heavySbjct 68 chain STTRSW 73 gi|5456914|gb|AAD43707.1| Score = 22.3 bits(45), Expect = 134 protocadherin alpha 5 Id = 6/8 (75%), Positives = 7/8(87%), gi|4502681|ref|NP_001772.1| Gaps = 0/8 (0%) CD69 antigen (p60,early T- Query 3 cell activation antigen) RISTTRSW 10gi|584906|sp|Q07108|CD69_ RIS+T SW HUMAN Early activation Sbjct 3491antigen CD69 RISSTSSW 3498 (Early T-cell activation Score = 16.8 bits(32), Expect = 6145 antigen p60) Id = 6/9 (66%), Positives = 7/9 (77%),Gaps = 2/9 (22%) Query 3 RISTT--RS 9 RIST+  RS Sbjct 312 RISTSPIRS 32098 2_C3 gi|30584378|gb|BT00777 GRVIQE 36 gi|28892|emb|CAA35582.1| Score= 28.6 bits (60), Expect = 1.4 0.1| Synthetic construct PGGRGD unnamedprotein product Identities = 12/22 (54%), Positives = 14/22 (63%), Homosapiens cystatin B KLLHQG gi|2135396|pir||S71548 Gaps = 3/22 (13%)(stefin B) mRNA ARRRRG homeotic protein pG2- Query 15 Score = 280bit(141) LRTPAS human LHQGARRRRGLRTPASVPISPS 36 Expect = 1e−72 VPISPSgi|48428276|sp|O43432|IF4G L QG RRR  L   +SVP +PS Id = 141/141 (100%)3_HUMAN Sbjct 227 Strand = Plus/Plus Eukaryotic translationLQQGGRRRGDL---SSVPTAPS 245 initiation factor 4 gamma 3 Score = 28.6 bits(60), Expect = 1,3 (eIF-4-gamma II) Identities = 12/22 (54%), Positives= 14/22 (63%), gi|2190402|emb|CAA73944.1| Gaps = 3/22 (13%) latentTGF-beta binding Query: 15 protein-4 LHQGARRRRGLRTPASVPISPS 36 L QGRRR  L   +SVP +PS Sbjct: 228 LQQGGRRRGDL---SSVPTAPS 246 Score = 26.9bits (56), Expect = 4.3 Identities = 15/28 (53%), Positives = 17/28(60%), Gaps = 7/28 (25%) Query: 1 GRVIQEPGGRGDKLLHQGARR-----RR 23 GR  QPGGRG LL+G+RR      RR Sbjct: 685 GR--QTPGGRGVPLLNVGSRRSQPGQRR 710 108_D4 gi|76779425|gb|BC10604 DMSYK 5 gi|59798440|sp|Q9UNA4|PO Score= 18.5 bits (36), Expect = 933  6 6.1| L1_HUMAN Id = 4/5 (80%),Positives = 5/5 (100%), Homo sapiens pre-B-cell DNA polymerase iota Gaps= 0/5 (0%) colony enhancing factor (RAD30 homolog B) 
(Eta2) Query 1 1,transcript variant 1, gi|56757695|sp|Q9ULC6|PA DMSYK 5 mRNA (cDNA cloneDI1_HUMANProtein- +MSYK MGC: 117256 arginine deiminase type I Sbjct 104IMAGE: 6161081), (Peptidylarginine deiminase EMSYK 108 complete cds I)Score = 18.0 bits (35), Expect = 914 Length = 2144gi|3122589|sp|O15355|PP2C Id = 4/4 (100%), Positives = 4/4 (100%), Score= 256 bits (129) G_HUMAN Protein Gaps = 0/4 (0%) Expect = 3e−65phosphatase 2C gamma Query 1 Id = 144/151 (95%) isoform DMSY 4 Gaps= 0/151 (0%) gi|68067549|sp|P07858|CAT DMSY Strand = Plus/Plus B_HUMANCathepsin B Sbjct 162 precursor (Cathepsin B1) DMSY 165gi|12585368|sp|QI6890|TPD5 3_HUMAN Tumor protein D53 (hD53)gi|52000845|sp|Q9UL12|SAR H_HUMAN Sarcosine dehydrogenase, mitochondrialprecursor (SarDH) (BPR-2) gi|1711371|sp|P54764|EPHA4 _HUMAN Ephrintype-A receptor 4 precursor (Tyrosine-protein kinase receptor SEK)gi|17380181|sp|O60508|PR17 HUMAN Pre-mRNA splicing factor PRP17 (hPRP17)(Cell division cycle 40 homolog) (EH-binding protein 3)gi|62087974|dbj|BAD92434. 
protein phosphatase 1G variant 10 1_D6gi|20357564|ref|NM_0001 RGDKLLHQ 27 gi|28892|emb|CAA355821| Score = 28.6bits (60), Expect = 1.6  8 00.2| Homo sapiens GARRRRGL unnamed proteinproduct Id = 12/22 (54%), Positives = 14/22 (63%), cystatin B (stefin B)RTPASVPIS gi|2135396|pir||S71548 Gaps = 3/22 (13%) (CSTB), mRNA PShomeotic protein pG2- Query 6 Length = 674 human LHQGARRRRGLRTPASVPISPS27 Score = 232 bits (117) gi|38648796|gb|AAH63316.1| L QG RRR  L   +SVP+PS Expect = 9e−59 FBXL17 protein Sbjct 228 Id = 117/117 (100%)LQQGGRRRGDL---SSVPTAPS 246 Gaps = 0/117 (0%) Score = 28.2 bits (59),Expect = 2.1 Strand = Plus/Plus Id = 11/18 (61%), Positives = 11/18(61%), Gaps = 5/18 (27%) Query 11 RRRRGL-----RTPASVP 23 RRRR L     RTPAVP Sbjct 25 RRRRPLLRLPRRTPAKVP 42 10 4_G4 gi|56788032|gb|AY69246 RPARSR14 gi|57163116|emb|CA139857.1 Score = 26.5 bits (55), Expect = 5.0  94.1| RMMAW protein (peptidyl-prolyl Id = 7/8 (87%), Positives = 7/8(87%), Homo sapiens growth- GKA cis/trans isomerase) NIMA- Gaps = 0/8(0%) inhibiting gene 46 interacting, 4 (parvulin) Query 1 mRNA, completecds gi|32171215|ref|NP_859066.1 RPARSRRM 8 Length = 1715 transducer ofregulated RP RSRRM Score = 825 bits (416) cAMP response element- Sbjct14 Expect = 0.0 binding protein (CREB) 2 RPERSRRM 21 Id = 416/416(100%)gi|59800455|sp|Q9UPY6|WA Score = 23.5 bits (48), Expect = 40 Gaps = 0/416(0%) SF3_HUMAN Id = 6/7 (85%), Positives = 6/7 (85%), Strand= Plus/Minus Wiskott-Aldrich syndrome Gaps = 0/7 (0%) protein familymember 3 Query 6 (WASP-family protein RRMMAWG 12 member 3) RR MAWGgi|4927214|gb|AAD33054.1| Sbjct = 144 Scar3 RRTMAWG 150gi|57013057|sp|Q6STE5|SMR Score = 23.1 bits (47), Expect = 53 D3_HUMANId = 8/14 (57%), Positives = 9/14 (64%), SW1/SNF-related matrix- Gaps= 4/14 (28%) associated actin-dependent Query 1 regulator of chromatinRPARSRR----MMA 10 subfamily D member 3 R AR+RR    MMAgi|19421557|gb|AAK56405.1 Sbjct 149 chromodomain helicase RKARNRRQEWNMMA162 DNA binding protein 5 
gi|1399745|gb|AAB08988.1|myelodysplasia/myeloid leukemia factor 2 gi|73921220|sp|Q86WB0|NIPA_HUMAN Nuclear- interacting partner of anaplastic lymphoma kinase)(hNIPA) gi|41019490|sp|P49736|MCM 2_HUMAN DNA replication licensingfactor MCM2 (Nuclear protein DM28) gi|1706888|sp|P53539|FOSB_(—) HUMANProtein fosB (G0/G1 switch regulatory protein 3) 11 10_Egi|123998528|gb|D08959 AETVGP 29 gi|113422952|ref|XP_001129 Score = 26.5bits (55), Expect = 6.8  2 8 40.2| GREEGC 208.1| PREDICTED: Id = 11/16(68%), Positives = 11/16 (68%), Synthetic construct WQRGRP hypotheticalprotein Gaps = 5/16 (31%) Homo sapiens clone NEETTCgi|34785506|gb|AAH57760.1| Query 5 IMAGE: 3938260; PSSRS MORN repeatcontaining 3 GPGREEGC--W-QRGR 17 FLH191546.01L;gi|119622706|gb|EAX02301.1 GPGR  GC  W QRGR RZPDo839E0667D proproteincouvertase sbjct 101 ribosomal protein L7a subtilisin/kexin type 6,GPGR--GCGGWVQRGR 114 (RPL7A) gene, encodes isoformn CRA_a Score = 25.2bits (52), Expect = 16 complete protein Id = 6/7 (85%), Positives = 7/7(100%), Length = 841 Gaps = 0/7 (0%) Score = 678 bits (342) Query 10Expect = 0.0 EGCWQRG 16 Id = 342/342 (100%) EGCW+RG Gaps = 0/342 (0%)Sbjct 161 Strand = Plus/Plus EGCWERG 167 Score = 24.8 bits (51), Expect= 22 Id = 7/7 (100%), Positives = 7/7 (100%), Gaps = 0/7 (0%) Query 4VGPGREE 10 VGPGREE Sbjct 620 VGPGREE 626 11 8_H1 gi|114520617|ref|NM_021PIDGLATSA 94 gi|119623272|gb|EAX02867.1 Score = 65.1 bits (146), Expect= 1e−10  4 0 130.3| Homo sapiens IMACEVTT hCG1792883, isoform Id = 23/35(65%), Positives = 27/35 (77%), peptidylprolyl isomerase LTHKPWNNgi|119623270|gb|EAX02865.1 Gaps = 1/35 (2%) A (cyclophilin A) SVKAGTLIThCG1792883, isoform Query41 (PPIA), mRNA KSFLSSAQSgi|119624576|gb|EAX04171.1 AQSTKIFCCLWDLVCKQLKGDAAQGLAVDGNVKEQ 75 Length= 2276 TKIFCCLW solute carrier family 22 AQS KIFCCLW+ V KQL  DAAQGL+ G+V+E Score = 436 bits (220) DLVCKQLK gi|119577790|gb|EAW57386.Sbjct56 Expect = 5e−120 GDAAQGLA 1| dystrophia myotonica-AQSMKIFCCLWNFVYKQL-EDAAQGLTMGGDVEEH 89 Id = 223/224 
(99%) VDGNVKEQcontaining WD repeat Score = 58.3 bits (130), Expect = 1e−08 Gaps= 0/224 (0%) SIHKLHNTA motif, isoform CRA_a Id = 20/31 (64%), Positives= 24/31 (77%), Strand = Plus/Minus RXSLRPHSS gi|119570084|gb|EAW49699.Gaps = 1/31 (3%) N 1| F-box and leucine-rich Query 45 repeat protein 15KIFCCLWDLVCKQLKGDAAQGLAVDGNVKEQ 75 gi|119570183|gb|EAW49798. KIFCCLW+ VKQL  DAAQGL + G+V+E 1| mitochondrial ribosomal Sbjct 2 protein L43,isoform CRA_c KIFCCLWNFVYKQL-EDAAQGLTMGGDVEEH 31gi|32165518|gb|AAP72126.1| Score = 27.4 bits (57), Expect = 25 Gprotein-coupled receptor Id = 10/15 (66%), Positives = 11/15 (73%), 120Gaps = 4/15 (26%) Query 49 CL---WDLVCKQLKG 60 CL   WDLVC+Q KG Sbjct 136CLSLQWDLVCEQ-KG 149 11 6_G8 gi|30584378|gb|BT00777 GRVIQE 36gi|28892|emb|CAA35582.1| Score = 28.6 bits (60), Expect = 1.4  50.1| Synthetic construct PGGRGD unnamed protein product Identities= 12/22 (54%), Positives = 14/22 (63%), Homo sapiens cystatin B KLLHQG gi|2135396|pir||S71548 Gaps = 3/22 (13%) (stefin B) mRNA ARRRRG homeotic protein pG2 Query 15 Score = 280 bits (141) LRTPAS gi|48428276|sp|O43432|IF4G LHQGARRRRGLRTPASVPISPS 36 Expect = 1e−72VPISPS 3_HUMAN L QG RRR L   +SVP +PS Id = 141/141 (100%) Eukaryotictranslation Sbjct 227 Strand = Plus/Plus initiation factor 4 gamma 3LQQGGRRRGDL---SSVPTAPS 245 (eIF-4-gamma 3) Score = 28.6 bits (60),Expect = 1.3 (eIF-4-gamma II) Identities = 12/22 (54%), Positives = 14/22(63%), gi|2190402|emb|CAA73944.1| Gaps = 3/22 (13%) latent TGF-betabinding Query: 15 protein-4 LHQGARRRRGLRTPASVPISPS 36 L QG RRR  L   +SVP+PS Sbjct: 228 LQQGGRRRGDL---SSVPTAPS 246 Score = 26.9 bits (56), Expect= 4.3 Identities = 15/28 (53%), Positives = 17/28 (60%), Gaps = 7/28(25%) Query: 1 GRVIQEPGGRGDKLLHQGARR-----RR 23 GR  QPGGRG  LL+ G+RR     RR Sbjct: 685 GR--QTPGGRGVPLLNVGSRRSQPGQRR 710 111_D4 gi|20357564|ref|NM_0001 GRVIQE 34 gi|28892|emb|CAA35582.1| Score= 28.6 bits (60), Expect = 1.4  7 00.2| Homo sapiens PGGRGD unnamedprotein product Identities = 12/22 (54%), Positives = 14/22 
(63%),cystatin B (stefin B) KLLHQG gi|2135396|pir||S71548 Gaps = 3/22 (13%)(CSTB) cystatin B (liver ARRRRG homeotic protein pG2 Query 15 thiolproteinase LRTPAS gi|48428276|sp|O43432|IF4G LHQGARRRRGLRTPASVPISPS 36inhibitor), mRNA VPISPS 3_HUMAN L QG RRR L    +SVP +PS Length = 674Eukaryotic translation Sbjct 227 Score = 412 bits (208) initiationfactor 4 gamma 3 LQQGGRRRGDL---SSVPTAPS 245 Expect = 8e−113 (eIF-4-gamma3) Score = 28.6 bits (60), Expect = 1.3 Id = 208/208 (100%) (eIF-4-gammaII) Identities = 12/22 (54%), Positives = 14/22 (63%), Gaps = 0/208 (0%)gi|2190402|emb|CAA73944.1| Gaps = 3/22 (13%) Strand = Plus/Plus latentTGF-beta binding Query: 15 protein-4 LHQGARRRRGLRTPASVPISPS 36 L QGRRR  L   +SVP +PS Sbjct: 228 LQQGGRRRGDL---SSVPTAPS 246 Score = 26.9bits (56), Expect = 4.3 Identities = 15/28 (53%), Positives = 17/28(60%), Gaps = 7/28 (25%) Query: 1 GRVIQEPGGRGDKLLHQGARR-----RR 23 GR  QPGGRG  LL+ G+RR     RR Sbjct: 685 GR--QTPGGRGVPLLNVGSRRSQPGQRR 710 116_C4 gi|71559137|ref|NM_0010 SSQGHF 6 gi|21753429|dbj|BAC04343.1 Score= 21.8 bits (44), Expect = 78  8 29.3| Unnamed protein product Id = 6/6(100%), Pos = 6/6 (100%), Homo sapiens ribosomalgi|12643872|sp|Q9UBS0|KS6 Gaps = 0/6 (0%) protein S26 (RPS26), B2_HUMANQuery 1 mRNA Ribosomal protein S6 SSQGHF 6 Length = 699 kinase 2 (S6K2)SSQGHF Score = 367 bits (185) (70 kDa ribosomal protein) Sbjct 74 Expect= 1e−98 (S6 kinase-related kinase) SSQGHF 79 Id = 202/210 (96%)(Serine/threonine-protein Score = 18.0 bits (35), Expect = 1097 Gaps= 0/210 (0%) kinase 14 beta) Id = 5/5 (100%), Pos = 5/5 (100%), Strand= Plus/Plus Colony stimulating factor 1 Gaps = 0/5 (0%) Roquin (RINGfinger and Query 1 C3H zinc finger protein 1) SSQGH 5 G protein coupledreceptor SSQGH 123 Sbjct 189 Solute carrier family 44, SSQGH 193 member5 Choline transporter-like protein 5 12 1_D8 gi|123997386|gb|DQ8953LPRVQAQG 64 gi|48429165|sp|Q86V81|THO Score = 36.7 bits (79), Expect= 0.016  1 69.2| GRVPEETE C4_HUMAN Id = 19/37 (51%), Positives = 21/37(56%), 
Synthetic construct GAGGGRGR THO complex subunit 4 Gaps = 14/37(37%) Homo sapiens clone QGRAGAPA (Tho4) (Ally of AML-1 and Query 8 IMAGE: 3505011; GRGTAAAQ LEF-1) (TranscriptionalGGRVPEETEGAGGGRGRQGRAGAPAGRGTAAAQGGAE 44 FLH183687.01L; GGAELGAEcoactivator Aly/REF) (bZIP GGR        GGGRGR GRAG+  GRG     GGA+RZPDo839F06141D AGGDAQEG enhancing factor BEF) Sbjct 21 CDC37 celldivision SLRPHSSN gi|47117879|sp|P83369|LSM1GGR--------GGGRGR-GRAGSQGGRG-----GGAQ 43 cycle 37 homolog (S. 1_HUMANScore = 21.0 bits (42), Expect = 848 cerevisiae) (CDC37) U7snRNA-associated Sm- Id = 10/16 (62%), Positives = 10/16 (62%), gene,encodes complete like protein LSm11 Gaps = 4/16 (25%) proteingi|30802096|gb|AAH51353.1| Query 19 Length = 1177 LSM11 proteinGGGRGRQGRAGAPAGR 34 Score = 333 bits (168) GG RGR GR    AGR Expect= 2e−88 Sbjct 221 Id = 168/168 (100%) GGARGR-GRG---AGR 232 Gaps = 0/168(0%) Score = 35.0 bits (75), Expect = 0.052 Strand = Plus/Plus Id= 17/24 (70%), Positives = 17/24 (70%), Gaps = 2/24 (8%) Query 19GGGRGRQGRA-GAPAGRGTAAAQG 41 GGGRGR GRA GA AG G  AA G Sbjct 69GGGRGR-GRARGAAAGSGVPAAPG 91 12 5_C6 gi|75992939|ref|NM_0046 VVSPPSSAR 22gi|119592058|gb|EAW71652 Score = 29.1 bits (61), Expect = 1.2  251.3| Homo sapiens PACVCPSSS 1| hCG2006056, isoform Id = 11/17 (64%),Positives = 12/17 (70%), ubiquitin specific DPPF CRA_d Gaps = 4/17 (23%)peptidase 11 (USP11), gi|119618258|gb|EAW97852. 
Query 5 mRNA1| acetyl-Coenzyme A PSSARPACVCPSS--SD 19 Length = 3300 carboxylasebeta, isoform PS  RP+CVCP S  SD Score = 446 bits (225)gi|14589876|ref|NP_115835.1 Sbjct 66 Expect = 5e−123 embryonalFyn-associated PS--RPSCVCPCSARSD 80 Id = 225/225(100%) substrate isoform2; Efs2 -- Score = 27.4 bits (57), Expect = 3.8 Gaps = 0/225(0%)gi|29336893|sp|Q96DN6|MB Id = 10/19 (52%), Positives = 11/19 (57%),Strand = Plus/Plus D6_HUMAN Methyl-CpG- Gaps = 8/19 (42%) binding domainprotein 6 Query 4 gi|51466429|ref|XP_380018.2 PPSSARPACVCPSSSDPPF 22similar to Ankyrin repeat PPSSARPA        PP+ and IBR domain-containingSbjct 254 protein 1; ANK1B1 protein PPSSARPA--------PPY 264gi|88943889|sp|Q70EK8|UBP Score = 26.5 bits (55), Expect = 6.9 53_HUMANId = 12/21 (57%), Positives = 13/21 (61%), Inactive ubiquitin carboxyl-Gaps = 6/21 (28%) terminal hydrolase 53 Query 1 VVSPPSSARPACVCPSSSDPP 21VV PP  ARP   CP+S  PP Sbjct 9 VVPPP--ARP---CPTSC-PP 23 12 6_F4gi|17426141|gb|AF32532 NSKE 4 gi|119626370|gb|EAX05965.1 Score = 18.0bits (35), Expect = 1284  4 6.1|F325326S0| Homo protein tyrosine Id= 5/5 (100%), Positives = 5/5 (100%), sapiens macrophin 1 phosphatase,non-receptor Gaps = 0/5 (0%) isoforms (MACF1) gene, type 13 Query 1 exon3 gi|119617338|gb|EAW96932. NSSKE 5 Length = 2486 1| ubiquitin specificNSSKE Score = 803 bits (405) peptidase 52, isoform Sbjct 916 Expect= 0.0 CRA_e NSSKE 920 Id = 405/405 (100%) gi|119585128|gb|EAW64724. 
Gaps= 0/405 (0%) 1| kinesin family member Strand = Plus/Plus 15gi|62087388|dbj|BAD92141.1 protein tyrosine phosphatase, non-receptortype 13 12 1_G8 gi|51317362|ref|NM_0024 AMRASRRR 79gi|4502919|ref|NP_001288.1| Score = 35.4 bits (76), Expect = 0.056  573.3| Homo sapiens FSSNARAPG cyclic nucleotide gated Id = 20/43 (46%),Positives = 23/43 (53%), myosin, heavy chain 9 GXHRPAGG channel beta 1Gaps = 16/43 (37%) non-muscle (MYH9), GAGGGAGQ gi|45476939|sp|Q8IX07|FOGQuery 39 mRNA HGADQRPA 1_HUMANRPA--EEGQPADRPDQH--R------PEPGAQ----PRPEERE 67 Length = 7474 EEGQPADRZinc finger protein ZFPM1 RP   EEG PA+ P++H  R      PEPG Q      PEEREScore = 626 bits (316) PDQHRPE (Friend of GATA protein 1) Sbjct 1201Expect = 4e−177 PGAQPRPE gi|4210366|emb|CAA10317.1|RPEGEEEG-PAE-PEEHSVRICMSPGPEPGEQILSVKMPEERE 1241 Id = 335/342 (97%)ERECSAAA APC2 protein Score = 24.0 bits (49), Expect = 157 Gaps = 1/342(0%) GTPEQGA gi|5031587|ref|NP_005874.1| Id = 17/48 (35%), Positives= 18/48 (37%), Strand = Plus/Plus adenomatosis polyposis coli 2 Gaps= 27/48 (56%) gi|21360802|gb|AAM49715.1| Query 42 hepatoma-derivedgrowth EEGQPADRPDQH-------------R--PE-PGAQP---------RPE 64 factor-HGDF5EEG  A  PDQH             R  PE PG+ P         RPEgi|21263499|sp|Q9BZ76|CNT Sbjct 1158 P3_HUMAN Contactin-EEGSAA--PDQHTHPREAATDPPAPRTPPEPPGSPPSSPPPASLCRPE 1203 associatedprotein-like 3 Score = 32.5 bits (69), Expect = 0.44 precursor (Cellrecognition Id = 21/41 (51%), Positives = 22/41 (53%), molecule Caspr3)Gaps = 14/41 (34%) gi|6002605|gb|AAF00055.1 Query 44 transcriptionfactor TBLYM GQPADRPDQHR--PEPGAQPRPEERECSAAAG---TPEQGA 79gi|7341310|gb|AAF61243.1| GQPA+ PD  R  P PGA  R EE      AG   TPE GAT-cell-specific T-box Sbjct 625 transcription factor T-betGQPAE-PDAPRSSPGPGA--R-EE-----GAGGAATPEDGA 656 gi|42490771|gb|AAH66122.1|Score = 32.0 bits (68), Expect = 0.59 BCR protein Id = 26/57 (45%),Positives = 28/57 (49%), gi|82546843|ref|NP_004318. 
Gaps = 19/57 (33%)
breakpoint cluster region isoform 1
Query 24 GGGAGGGAGQHGADQRPAEEGQPADRPDQHRPEP-GAQPR----PE-E--RECSAAA 72
GGGAGG AG HA  R  EEG          P P G++PR     E E  REC  AA
Sbjct 1305 GGGAG-AGLHFAGHRRRREEG----------PAPTGSRPRGAADQELELLRECLGAA 1350
Score = 19.7 bits (39), Expect = 2966
Id = 5/6 (83%), Positives = 6/6 (100%), Gaps = 0/6 (0%)
Query 37 DQRPAE 42
D+RPAE
Sbjct 1658 DERPAE 1663
Score = 18.9 bits (37), Expect = 5340
Id = 10/16 (62%), Positives = 10/16 (62%), Gaps = 4/16 (25%)
Query 13 ARAPGGXHRPAGGGAG 28
AR  GG   P GGGAG
Sbjct 390 ARD-GG---PEGGGAG 401
Score = 17.6 bits (34), Expect = 12900
Id = 7/9 (77%), Positives = 7/9 (77%), Gaps = 0/9 (0%)
Query 24 GGGAGGGAG 32
GG  GGGAG
Sbjct 393 GGPEGGGAG 401
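The identity, positive and gap percentages reported with each alignment in these tables can be checked against their own counts. The following is a minimal Python sketch for such a check and is not part of the original analysis. The function name `check_stats` and the regular expression are illustrative assumptions, and the use of integer truncation rather than rounding is inferred from table entries such as 12/22 being reported as 54%.

```python
import re

# Matches entries such as "Id = 12/22 (54%)", "Identities= 15/28 (53%)",
# "Pos = 6/6 (100%)", "Positives = 17/28(60%)" and "Gaps = 3/22 (13%)".
STAT = re.compile(
    r"(Identities|Id|Positives|Pos|Gaps)\s*=\s*(\d+)\s*/\s*(\d+)\s*\(\s*(\d+)%\)"
)


def check_stats(text):
    """Return (label, count, length, reported_pct, consistent) for each entry.

    An entry is treated as consistent when the reported percentage equals
    100 * count / length truncated to an integer (an assumption inferred
    from the table, not a documented rule).
    """
    results = []
    for label, count, length, pct in STAT.findall(text):
        count, length, pct = int(count), int(length), int(pct)
        if length == 0:
            # A zero-length alignment cannot be checked; flag it.
            results.append((label, count, length, pct, False))
            continue
        expected = (100 * count) // length
        results.append((label, count, length, pct, expected == pct))
    return results


if __name__ == "__main__":
    line = "Id = 12/22 (54%), Positives = 14/22 (63%), Gaps = 3/22 (13%)"
    for entry in check_stats(line):
        print(entry)
```

Entries flagged as inconsistent are candidates for transcription errors in either the count or the percentage and would need to be compared with the original BLAST report.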

TABLE 6b
Rank; Clone; Description of the genes that are in Mimotope clones; Peptide sequences of Mimotopes, in-frame with T7 10 B gene; Size of peptide; Description of the sequences that Mimotopes mimic; Region of similarity of peptide
 7 2_D4 gi|37790795| WDCATACQ 29gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), gb|AY42221 PGSQRDSVSUnnamed protein product Expect = 0.003 1.1| KKKKKKGGgi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), Homo sapiens XGKN Very veryhypothetical Positives = 16/25 (64%), cholesteryl protein RMSA-1 Gaps= 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8 fer protein,Lymphoid enhancer binding QPGS---------QRDSVSKKKKKK 23 plasma (CETP)factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com-gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMANEPGSCHSTPAWATERDSVSKKKKKK 151 Length = 25612 Initiation factor 2 assoc-Score = 16.8 bits (32), Score = 99.6 iated 67 kDa glycoprotein Expect= 5786 bits (50) (p67) (p67eIF2) Id = 6/11 (54%), Expect = 2e−gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), 18 L2_HUMAN Gaps= 0/11 (0%) Id = 50/50 Transcription factor 7- Query 13 (100%) like 2RDSVSKKKKKK 23 Gaps = 0/50 gi|71153825|sp|Q9BQG0|MB R+ VS K  KK (0%)B1A_HUMAN Sbjct 85 Strand = Myb-binding protein 1A RNPVSTKSTKK 95Plus/Minus E2F-associated phosphoprotein (EAPP)gi|12644154|sp|P20042|IF2B_(—) HUMAN Eukaryotic translation ini- tiationfactor 2 subunit 2 Metastasis adhesion protein (Metadherin) Programmedcell death protein 11 (T cell-specific transcrip- tion factor 1-alpha)(TCF1- alpha) gi|2822161|gb|AAB97937.1| rab3 effector-like 8 2_D8gi|37790795| WDCATA 29 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits(81), gb|AY42221 CQPGSQ Unnamed protein product Expect = 0.003 1.1|RDSVSK gi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), Homo sapiens KKKKKRVery very hypothetical Positives = 16/25 (64%), cholesteryl G proteinRMSA-1 Gaps = 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8fer protein, Lymphoid enhancer binding QPGS---------QRDSVSKKKKKK 23plasma (CETP) factor 
1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com-gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMANEPGSCHSTPAWATERDSVSKKKKKK 151 Length = 25612 Initiation factor 2 assoc-Score = 16.8 bits (32), Score = 99.6 iated 67 kDa glycoprotein Expect= 5786 bits (50) (p67) (p67eIF2) Id = 6/11 (54%), Expect = 2e−gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), 18 L2_HUMAN Gaps= 0/11 (0%) Id = 50/50 Transcription factor 7- Query 13 (100%) like 2RDSVSKKKKKK 23 Gaps = 0/50 gi|71153825|sp|Q9BQG0|MB R+ VS K  KK (0%)B1A_HUMAN Sbjct 85 Strand = Myb-binding protein 1A RNPVSTKSTKK 95Plus/Minus E2F-associated phosphoprotein (EAPP)gi|12644154|sp|P20042|IF2B_(—) HUMAN Eukaryotic translation ini- tiationfactor 2 subunit 2 Metastasis adhesion protein (Metadherin) Programmedcell death protein 11 (T cell-specific transcrip- tion factor 1-alpha)(TCF1- alpha) gi|2822161|gb|AAB97937.1| rab3 effector-like 9 2_Ggi|37790795| WDCATA 29 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits(81), 11 gb|AY42221 CQPGSQ Unnamed protein product Expect = 0.003 1.1|RDSVSK gi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), Homo sapiens KKKKKRVery very hypothetical Positives = 16/25 (64%), cholesteryl G proteinRMSA-1 Gaps = 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8fer protein, Lymphoid enhancer binding QPGS---------QRDSVSKKKKKK 23plasma (CETP) factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com-gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMANEPGSCHSTPAWATERDSVSKKKKKK 151 Length = 25612 Initiation factor 2 assoc-Score = 16.8 bits (32), Score = 99.6 iated 67 kDa glycoprotein Expect= 5786 bits (50) (p67) (p67eIF2) Id = 6/11 (54%), Expect = 2e−gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), 18 L2_HUMAN Gaps= 0/11 (0%) Id = 50/50 Transcription factor 7- Query 13 (100%) like 2RDSVSKKKKKK 23 Gaps = 0/50 gi|71153825|sp|Q9BQG0|MB R+ VS K  KK (0%)B1A_HUMAN Sbjct 85 Strand = Myb-binding protein 1A RNPVSTKSTKK 95Plus/Minus E2F-associated phosphoprotein (EAPP)gi|12644154|sp|P20042|IF2B_(—) HUMAN Eukaryotic 
translation ini- tiationfactor 2 subunit 2 Metastasis adhesion protein (Metadherin) Programmedcell death protein 11 (T cell-specific transcrip- tion factor 1-alpha)(TCF1- alpha) gi|2822161|gb|AAB97937.1| rab3 effector-like 16 6_Ggi|12584762| NSSGV 5 gi|119630243|gb|EAX09838.1 Score = 17.2 bits (33),12 emb|AL3910 interferon (alpha, beta and Expect = 2312 01.12| omega)receptor 1, isoform Id = 5/5 (100%), Human DNA se-gi|119609949|gb|EAW89543. Positives = 5/5 (100%), quence from 1| lectin,galactoside- Gaps = 0/5 (0%) clone RP11- binding, soluble, 3 bindingQuery 1 477H21 on protein, isoform CRA_a NSSGV 5 chromosome 1gi|119593835|gb|EAW73429. NSSGV Contains part 1| cadherin, EGF LAG Sbjct154 of the PBX1 seven-pass G-type receptor NSSGV 158 gene for pre- 1(flamingo homolog, B-cell leu- Drosophila), isoformCRA_b kemia trans-gi|56203586|emb|CA119162.1| cription fac- tripartite motif-containingtor 1 67 Length = 196730 Score = 65.9 bits (33) Expect = 2e− 08 Id= 59/71 (83%) Gaps = 0/71 (0%) Strand = Plus/Plus /gene = “PBX1”/product = “pre-B-cell leukemia transcription factor 1” 17 7_C1gi|18072476| GRGGGGGR 33 gi|56566042|gb|AAV98357.1| Score = 37.1 bits(80), emb|AL3566 GGGGRIRR nucleolar protein family A Expect = 0.00408.23| Human RKPQEPEK member 1 Id = 16/19 (84%), DNA sequence QRAEVQIQgi|119615069|gb|EAW94663. 
Positives = 16/19 (84%), from clone G1| CEBP-induced protein Gaps = 2/19 (10%) RP11-724N1 on Heterogeneousnuclear Query 1 chromosome 10 ribonucleoprotein U-likeGRGG-GGGRGGGGGI-GGR 17 Contains the protein 1 GRGG GGGRGGGGG  GCR 5′ endof the (Adenovirus early region Sbjct 172 CNNM2 gene 1B-associatedprotein 5) GRGGRGGGRGGGGGFRGGR 190 for cyclin M2 (E1B-55 kDa-associatedScore = 32.5 bits (69), and a CpG is- protein 5) (E1B-AP5) Expect = 0.11land, com- Id = 13/19 (68%), plete se- Positives = 14/19 (73%), quenceGaps = 3/19 (15%) Length = Query 1 161053 GRGGGG---GRGGGGGIRR 16 Score= 971 GRGGG    GRGG GG+RR bits (490) Sbjct 8 Expect = 0.0GRGGGAWGPGRGGAGGLRR 26 Id = 490/490 (100%), Gaps = 0/490 (0%) Strand =Plus/Minus 18 7_C4 gi|77744392| ADHKVRSL 11 gi|55961387|emb|CA117418.1Score = 24.4 bits (50), gb|DQ23098 RPA cAMP responsive element Expect= 22 8.1| SW1/SNF binding protein-like 1 Id = 7/7 (100%), related,(24.4) Pos = 7/7 (100%), matrix assoc- gi|1705471|sp|P55107|BMP3 Gaps= 0/7 (0%) iated, actin B_HUMAN Query 5 dependent Bone morphogeneticprotein VRSLRPA 11 regulator of 3b precursor (BMP-3b) VRSLRPA chromatin,Growth/differentiation Sbjct 151 subfamily b, factor 10 (GDF-10) (23.5)VRSLRPA 157 member 1 gene (Bone inducing protein) Score = 23.5 bits(48), Length = gi|12643326|sp|Q9ULV3|C1Z Expect = 54 50890 1_HUMANCip1-interacting Id = 9/15 (60%), Score = 125 zinc finger protein Pos= 11/15 (73%), bits (63) (Nuclear protein NP94) Gaps = 2/15 (13%) Expect= 7e− gi|62087674|dbj|BAD92284.1 Query 3 27 ELK1, member of ETSPNSSADHKVRSLRPA 17 Id = 63/63 oncogene family variant PN+SAD +VR  R A(100%) Ubiquitin specific protease Sbjct 280 Gaps = 0/63 42PNNSADPRVR--RAA 292 (0%) Chondroitin-4-O- Score = 21.0 bits (42), Strand= sulfotransferase-3 Expect = 233 Plus/Plus Id = 7/11 (63%), Positives= 9/11 (81%), Gaps = 2/11 (18%) Query 2 DHKV--RSLRP 10 DHK+  +SLRP Sbjct27 DHKIAKQSLRP 37 23 2_D3 gi|37552371| NSSLF 5gi|119623720|gb|EAX03315.1 Score = 18.5 bits (36), ref|NT_0112 
DEAH(Asp-Glu-Ala-His) Expect = 957 55.14| box polypeptide 16, isoform Id= 5/5 (100%), Hs19_11412 CRA_d Positives = 5/5 (100%), Homo sapiensgi|119618562|gb|EAW98156. Gaps = 0/5 (0%) chromosome 19 1| citron(rho-interacting, Query 1 genomic con- serine/threonine kinase NSSLF 5tig, refer- 21), isoform CRA_b NSSLF ence assemblygi|5833114|gb|AAD53401.1|A Sbjct 265 Length = F107840_1 nuclear pore-NSSLF 269 7286004 associated protein Features ingi|119581856|gb|EAW61452. this part of 1| T-cell leukemia homeoboxsubject se- 3 quence: adenomatosis polyposis coli 2 Score = 793 bits(400) Expect = 0.0 Id = 411/416 (98%) Gaps = 0/416 (0%) Strand =Plus/Minus 24 3_D4 gi|555853|gb| PRQSFTLVA 9 gi|21749258|dbj|BAC03563.1Score = 25.2 bits (52), U13369.1|H Unnamed protein product Expect = 11SU13369 Human gi|14043145|gb|AAH07560.1| Id = 7/7 (100%), ribosomal DNALASP1 protein Pos = 7/7 (100%), complete re- LIM and SH3 domain Gaps= 0/7 (0%) peating unit protein 1 (LASP-1) (MLN Query 2 Length = 50)RQSFTLV 8 42999 gi|46937163|emb|CAE45323. 
RQSFTLV Score = 8501|LIM-nebulette alpha-1,3- Sbjct 25 bits (429) glucosyltransferase ALG8RQSFTLV 31 Expect = 0.0 isoform b Score = 25.2 bits (52), Id = 431/432Carbohydrate Expect = 17 (99%) sulfotransferase 6 Id = 7/9 (77%), Gaps= 0/432 Sortilin-related receptor Pos = 9/9 (100%), (0%) precursor Gaps= 0/9 (0%) Strand = Novel protein similar to Query 7 Plus/Plus mousemeiosis defective 1 PRQSFTLVA 15 gene Erythroid P+QSFT+VAdifferentiation-related Sbjct 58 factor 1 (EDRF1) PKQSFTMVA 66 30 9_Ggi|17907166| RVGRKVQ 7 gi|8134304|sp|Q92667|AKAP Score = 22.3 bits (45),8 emb|AL1333 1_HUMAN A kinase anchor Expect = 131 51.34| protein 1(Protein kinase A Id = 7/8 (87%), Human DNA se- anchoring protein 1)(22.3) Positives = 7/8 (87%), quence from (Spermatid A-kinase anchorGaps = 0/8 (0%) clone RP1- protein 84)(S-AKAP84) Query 6 90J20 ongi|22261|sp|P13765|2DOB_(—) SRVGRKVQ 13 chromosome HUMAN HLA class IISRV RKVQ 6p24.1-25.3 histocompatibility antigen, Sbjct 172 Contains theDO beta chain precursor SRVPRKVQ 179 5′ end of agi|57209707|emb|CA141975.1 Score = 21.4 bits (43), putative Proteinphosphatase 2 Expect = 235 novel gene, regulatory subunit B, beta Id= 6/6 (100%), the SERPINB9 Dual specificity A-kinase Positives = 6/6(100%), gene for ser- anchoring protein 1 Gaps = 0/6 (0%) ine (or cy-Solute carrier family 2 Query 8 steine) VGRKVQ 13 proteinase VGRKVQinhibitor Sbjct 117 clade B VGRKVQ 122 (ovalbumin) Score = 18.5 bits(36), member 9, a Expect = 954 novel gene, Id = 5/5 (100%), the SERPINB6Positives = 5/5 (100%), gene for Gaps = 0/5 (0%) serine (or Query 3cysteine) GRKVQ 7 proteinase GRKVQ inhibitor Sbjct 335 clade B GRKVQ 339(ovalbumin) member 6, three puta- tive novel genes, the gene for NAD(P)Hdehydrogenase quinone 2 (NQO2) and eight pre- dicted CpG islands, com-plete se- quence Length = 173670 Score = 559 bits (282) Expect = 3e− 156Id = 314/320 (98%) Gaps = 4/320 (1%) Strand = Plus/Plus 33 2_Ggi|14336700| LQPGRQSET 36 gi|34528883|dbj|BAC85594.1| Score = 55.4 bits(123), 4 
gb|AE00646 PSQKKKKK Phosphorylase kinase Expect = 2e−074.1| Homo NLGGMGTG alpha/beta Id = 17/17 (100%), sapiens AHPFDPSTLgi|33415049|gb|AAQ18032.1| Positives = 17/17 (100%), 16p13.3 se- GSTransformation-related Gaps = 0/17 (0%) quence sec- protein 10 Query 1tion 3 of 8 gi|34528883|dbj|BAC85594.1| LQPGRQSETPSQKKKKK 17 Length =unnamed protein product LQPGRQSETPSQKKKKK 256073 Phosphorylase kinaseSbjct 19 Features in alpha/beta LQPGRQSETPSQKKKKK 35 this part ofPutative nucleolar Score = 44.3 bits (97), subject traffickingphosphoprotein Expect = 4e−04 sequence: SLC2A11 protein variant Id= 15/17 (88%), RAR (RAS like CLL associated antigen Positives = 15/17(88%), GTPase) like KW-2 Gaps = 0/17 (0%) Score = 103 Ras-relatedprotein H-Ras Query 1 bits (52) BCL2-associated LQPGRQSETPSQKKKKK 17Expect = 8e− athanogene isoform 1L LQPG QSETPSQKK KK 20 (BAG-1) Sbjct 37Id = 73/81 Lung cancer metastasis- LQPGLQSETPSQKKTKK 53 (90%) relatedprotein (LCMRP1) Gaps = 0/81 (0%) Strand = Plus/Plus 36 5_Ggi|119380763| HRGSPSNVG 24 gi|4837879|gb|AAD30731.1| Score = 25.7 bits(53), 3 gb|EF17744 AFRIGRESV immunoglobulin heavy Expect = 12 7.1|HomoKESLFY chain variable region Id = 8/13 (61%), sapiens iso-gi|30060232|gb|AAP13073.1| Positives = 10/13 (76%), late TA23 E3 ligasefor inhibin Gaps = 0/13 (0%) mitochon- receptor Query 11 drion, com-gi|85682779|sp|Q9ULT8|HEC FRIGRESVKESLF 23 plete genome D1_HUMAN F I R++VK SLF Length = E3 ubiquitin-protein ligase Sbjct 64 16569 HECTD1FTISRDNVKNSLF 76 Score = 105 (HECT domain-containing Score = 25.2 bits(52), bits (53) protein 1) Expect = 17 Expect = 1e− (E3 ligase forinhibin Id = 11/21 (52%), 20 receptor) (EULIR) Positives = 12/21 (57%),Id = 53/53 Gaps = 7/21 (33%) (100%) Query 9 Gaps = 0/53GAFRIGR---ESVK----ESL 22 (0%) G FR+GR   E VK    ESL Strand = Sbjct 2123Plus/Minus GEFRVGRLKHERVKVPRGESL 2143 Score = 15.1 bits (28), Expect= 19195 Id = 4/4 (100%), Positives = 4/4 (100%), Gaps = 0/4 (0%) Query 2RGSP 5 RGSP Sbjct 288 RGSP 291 Score = 25.2 bits (52), Expect 
= 17 Id= 11/21 (52%), Positives = 12/21 (57%), Gaps = 7/21 (33%) Query 9GAFRIGR---ESVK----ESL 22 G FR+GR   E VK    ESL Sbjct 2123GEFRVGRLKHERVKVPRGESL 2143 43 10_D gi|119380763| NSSVGC 6gi|55960173|emb|CA114537.1| Score = 17.2 bits (33), 6 gb|EF17744 HIVtype I enhancer Expect = 2774 7.1|Homo binding protein 3 Id = 5/5(100%), sapiens iso- gi|119617590|gb|EAW97184.1| Positives = 5/5 (100%),late TA23 Mdm4, transformed 3T3 Gaps = 0/5 (0%) mitochon- cell doubleminute 1, p53 Query 1 drion, com- binding protein (mouse), NSSVG 5 pletegenome isoform CRA_b NSSVG Length = gi|119604594|gb|EAW84188.1| Sbjct 3016569 IVRAB3D, member RAS NSSVG 34 Score = 446 oncogene family, isoformbits (225), CRA_a Expect = 4e− 123 Id = 261/261 (100%), Gaps = 0/261(0%) Strand = Plus/Minus 56 1_G gi|37790795| WDCATACQ 56gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), 4 gb|AY42221PGSQRDSVS Unnamed protein product Expect = 0.019 1.1| KKKKKKGGgi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), Homo sapiens XEKXXGXG HUMANPositives = 16/25 (64%), cholesteryl XFFXXXKX Very very hypotheticalGaps = 9/25 (36%) ester trans- XXXFXRXF protein RMSA-1 Query 8 ferprotein, XPXFXXK gi|537530|gb|AAB68050.1| QPGS---------QRDSVSKKKKKK 23plasma (CETP) chromnosomal protein +PGS---------+RDSVSKKKKKK gene, com-Lymphoid enhancer binding Sbjct 127 plete cds factor 1 (LEF-1)EPGSCHSTPAWATERDSVSKKKKKK 151 Length = (T cell-specific transcrip- Score= 27.4 bits (57), 25612 tion factor 1-alpha) (TCF1- Expect = 22 Score= 99.6 alpha) Id = 8/10 (80%), bits (50) Methionine aminopeptidasePositives = 10/10 (100%), Expect = 1e− 2 (MetAP 2) (Initiation Gaps= 0/10 (0%) 18 factor 2 associated 67 kDa Query 14 Id = 50/50glycoprotein) DSVSKKKKKK 23 (100%) (p67) (p67eIF2) ++VSKKKKKK Gaps= 0/50 Sbjct 265 (0%) ETVSKKKKKK 274 Strand = Plus/Minus 72 10_(—)gi|14329907| ARQVF 5 gi|57162452|emb|CA140479.1 Score = 19.3 bits (38),H11 emb|AL1624 BRCA2 (Breast cancer 2, Expect = 378 31.17| Human earlyonset) (19.3) Id = 5/5 (100%), DNA sequence 
gi|4502451|ref|NP_000050.1|Positives = 5/5 (100%), fr clone Breast cancer 2, early on- Gaps = 0/5(0%) RP11-46A10 on set Query 1 chromosome gi|13540359|gb|AAK9432.1|ARQVF 5 1q25.2-31.1 Mutant early onset breast ARQVF Contains 3′ cancersusceptibility pro- Sbjct 329 end of XPR1 tein ARQVF 333 gene for xen-gi|21361458|ref|NP_055601.2 otropic and Rho guanine nucleotidepolytropic exchange factor (GEF) 17 retrovirusgi|15987489|gb|AAL11991.1| receptor Tumor endothelial marker 4 Length =gi|51477779|ref|XP_067076.3 139006 PREDICTED: similar to Score = 170testis specific protein, Y- bits(86) linked 2 Expect = 7e− Crn (crookedneck-like 1) 40 Inhibitor of Bruton's Id = 117/122 tyrosine kinase(16.8) (95%) Gaps = 0/122 (0%) Strand = Plus/Minus 74 2_C1 gi|37790795|WDCATA 29 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), 2gb|AY42221 CQPGSQ Unnamed protein product Expect = 0.003 1.1|Homo RDSVSKgi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), sapiens KKKKKR Very veryhypothetical Positives = 16/25 (64%), cholesteryl G protein RMSA-1 Gaps= 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8 fer protein,Lymphoid enhancer binding QPGS---------QRDSVSKKKKKK 23 plasma (CETP)factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com-gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMANEPGSCHSTPAWATERDSVSKKKKKK 151 Length = Initiation factor 2 assoc- Score= 16.8 bits (32), 25612 iated 67 kDa glycoprotein Expect = 5786 Score= 99.6 (p67) (p67eIF2) Id = 6/11 (54%), bits (50)gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), Expect = 2e− L2_HUMANGaps = 0/11 (0%) 18 Transcription factor 7- Query 13 Id = 50/50 like 2RDSVSKKKKKK 23 (100%) gi|71153825|sp|Q9BQG0|MB R+ VS K  KK Gaps = 0/50B1A_HUMAN Sbjct 85 (0%) Myb-binding protein 1A RNPVSTKSTKK 95 Strand =E2F-associated Plus/Minus phosphoprotein (EAPP)gi|12644154|sp|P20042|IF2B_(—) HUMAN Eukaryotic translation initiationfactor 2 subunit 2 Metastasis adhesion protein (Metadherin) Programmedcell death protein 11 (T cell-specific transcrip- tion factor 
1-alpha)(TCF1- alpha) Methionine aminopeptidase 2 (MetAP 2) (Peptidase M 2)(Initiation factor 2 associated 67 kDa glycoprotein) (p67) (p67eIF2)gi|2822161|gb|AAB97937.1| rab3 effector-like 91 2_F3 gi|37790795| WDCATA29 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), 2 gb|AY42221CQPGSQ Unnamed protein product Expect = 0.003 1.1|Homo RDSVSKgi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), sapiens KKKKKR Very veryhypothetical Positives = 16/25 (64%), cholesteryl G protein RMSA-1 Gaps= 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8 fer protein,Lymphoid enhancer binding QPGS---------QRDSVSKKKKKK 23 plasma (CETP)factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com-gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMANEPGSCHSTPAWATERDSVSKKKKKK 151 Length = Initiation factor 2 assoc- Score= 16.8 bits (32), 25612 iated 67 kDa glycoprotein Expect = 5786 Score= 99.6 (p67) (p67eIF2) Id = 6/11 (54%), bits (50)gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), Expect = 2e− L2_HUMANGaps = 0/11 (0%) 18 Transcription factor 7- Query 13 Id = 50/50 like 2RDSVSKKKKKK 23 (100%) gi|71153825|sp|Q9BQG0|MB R+ VS K  KK Gaps = 0/50B1A_HUMAN Sbjct 85 (0%) Myb-binding protein 1A RNPVSTKSTKK 95 Strand =E2F-associated Plus/Minus phosphoprotein (EAPP)gi|12644154|sp|P20042|IF2B_(—) HUMAN Eukaryotic translation initiationfactor 2 subunit 2 Metastasis adhesion protein (Metadherin) Programmedcell death protein 11 (T cell-specific transcrip- tion factor 1-alpha)(TCF1- alpha) Methionine aminopeptidase 2 (MetAP 2) (Peptidase M 2)gi|2822161|gb|AAB97937.1| rab3 effector-like 92 2_H gi|37790795| WDCATA29 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), 2 gb|AY42221CQPGSQ Unnamed protein product Expect = 0.003 1.1|Homo RDSVSKgi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), sapiens KKKKKR Very veryhypothetical Positives = 16/25 (64%), cholesteryl G protein RMSA-1 Gaps= 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8 fer protein,Lymphoid enhancer binding QPGS---------QRDSVSKKKKKK 23 plasma 
(CETP)factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com-gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMANEPGSCHSTPAWATERDSVSKKKKKK 151 Length = Initiation factor 2 assoc- Score= 16.8 bits (32), 25612 iated 67 kDa glycoprotein Expect = 5786 Score= 99.6 (p67) (p67eIF2) Id = 6/11 (54%), bits (50)gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), Expect = 2e− L2_HUMANGaps = 0/11 (0%) 18 Transcription factor 7- Query 13 Id = 50/50 like 2RDSVSKKKKKK 23 (100%) gi|71153825|sp|Q9BQG0|MB R+ VS K  KK Gaps = 0/50B1A_HUMAN Sbjct 85 (0%) Myb-binding protein 1A RNPVSTKSTKK 95 Strand =E2F-associated Plus/Minus phosphoprotein (EAPP)gi|12644154|sp|P20042|IF2B_(—) HUMAN Eukaryotic translation initiationfactor 2 subunit 2 Metastasis adhesion protein (Metadherin) Programmedcell death protein 11 (T cell-specific transcrip- tion factor 1-alpha)(TCF1- alpha) Methionine aminopeptidase 2 (MetAP 2) (Peptidase M 2)gi|2822161|gb|AAB97937.1| rab3 effector-like 10 5_D4 gi|16306514|YKQRRRVK 30 gi|16741704|gb|AAH16648.1| Score = 28.2 bits (59),  4gb|AC00996 REHPNRKQ Fos-related antigen 1 (FOS- Expect = 2.0 7.8|HomoLKDILLASH like antigen 1) (28.2) Id = 13/31 (41%), sapiens BAC GGPSPgi|29792148|gb|AAH50283.1| Pos = 16/31 (51%), clone RP11- WASF3 protein(27.8) Gaps = 11/31 (35%) 401O19 gi|59798973|sp|Q9H0K1|SN1 Query3 from 2L2_HUMAN QRRRVKREHPN----------RKQLKDILLA 23 Length = Serine/threonineprotein +RRRV+RE  N          RK+L D L A 176206 kinase SNF1-like kinase 2Sbjct106 Score = 694 gi|7380440|sp|P46100|ATRXERRRVRRER-NKLAAAKCRNRRKELTDFLQA 135 bits (350) _HUMAN Score = 27.8 bits(58), Expect = 0.0 Transcriptional regulator Expect = 2.0 Id = 350/350ATRX (X-linked helicase II) Id = 10/19 (52%), (100%) (X-linked nuclearprotein) Pos = 13/19 (68%), Gaps = 0/350 Fonconi anemia group A Gaps= 4/19 (21%) (0%) Protein (25.7) Query 2 Strand = E2F dimerizationpartner 2 KQRRRVKRE----HPNRKQ 16 Plus/Minus Dehydrogenase/reductaseK++RR KRE    +PNR Q SDR family member 7 Sbjct 174 Retinal short 
chainKEKRRQKREKHKLNPNRNQ 192 dehydrogenase/reductase 4 Score = 26.9 bits(56), Expect = 4.9 Id = 9/14 (64%), Pos = 11/14 (78%), Gaps = 2/14 (14%)Query 17 LKDILLASHGGPSP 30 LKDI+LA+   PSP Sbjct 535 LKDIMLANQ--PSP 546 7 2_D4 gi|37790795| WDCATACQ 29 gi|34535600|dbj|BAC87373.1| Score= 37.5 bits (81), gb|AY42221 PGSQRDSVS Unnamed protein product Expect= 0.003 1.1|Homo KKKKKKGG gi|1350799|sp|P49646|YYY1 Id = 14/25 (56%),sapiens XGKN Very very hypothetical Positives = 16/25 (64%), cholesterylprotein RMSA-1 Gaps = 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1Query 8 fer protein, Lymphoid enhancer binding QPGS---------QRDSVSKKKKKK23 plasma (CETP) factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com-gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMANEPGSCHSTPAWATERDSVSKKKKKK 151 Length = Initiation factor 2 assoc- Score= 16.8 bits (32), 25612 iated 67 kDa glycoprotein Expect = 5786 Score= 99.6 (p67) (p67eIF2) Id = 6/11 (54%), bits (50)gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), Expect = 2e− L2_HUMANGaps = 0/11 (0%) 18 Transcription factor 7- Query 13 Id = 50/50 like 2RDSVSKKKKKK 23 (100%) gi|71153825|sp|Q9BQG0|MB R+ VS K  KK Gaps = 0/50B1A_HUMAN Sbjct 85 (0%) Myb-binding protein 1A RNPVSTKSTKK 95 Strand =E2F-associated Plus/Minus phosphoprotein (EAPP)gi|12644154|sp|P20042|IF2B_(—) HUMAN (Metadherin) Programmed cell deathprotein 11 (T cell-specific transcrip- tion factor 1-alpha) (TCF1-alpha) Methionine aminopeptidase 2 (MetAP 2) (Peptidase M 2) (Iniationfactor 2 associated 67 kDa glycoprotein) (p67)(p67eIF2)gi|2822161|gb|AAB97937.1| rab3 effector-like  8 2_D8 gi|37790795| WDCATA29 gi|34535600|dbj|BAC87373.1| Score = 37.5 bits (81), gb|AY42221 CQPGSQUnnamed protein product Expect = 0.003 1.1|Homo RDSVSKgi|1350799|sp|P49646|YYY1 Id = 14/25 (56%), sapiens KKKKKR Very veryhypothetical Positives = 16/25 (64%), cholesteryl G protein RMSA-1 Gaps= 9/25 (36%) ester trans- gi|8928194|sp|Q9UJU2|LEF1 Query 8 fer protein,Lymphoid enhancer binding QPGS---------QRDSVSKKKKKK 23 
plasma (CETP)factor 1 (LEF-1) +PGS         +RDSVSKKKKKK gene, com-gi|1703273|sp|P50579|AMPM Sbjct 127 plete cds 2_HUMANEPGSCHSTPAWATERDSVSKKKKKK 151 Length = Initiation factor 2 assoc- Score= 16.8 bits (32), 25612 iated 67 kDa glycoprotein Expect = 5786 Score= 99.6 (p67) (p67eIF2) Id = 6/11 (54%), bits (50)gi|29337146|sp|Q9NQB0|TF7 Positives = 7/11 (63%), Expect = 2e− L2_HUMANGaps = 0/11 (0%) 18 Transcription factor 7- Query 13 Id = 50/50 like 2RDSVSKKKKKK 23 (100%) gi|71153825|sp|Q9BQG0|MB R+ VS K  KK Gaps = 0/50B1A_HUMAN Sbjct 85 (0%) Myb-binding protein 1A RNPVSTKSTKK 95 Strand =E2F-associated Plus/Minus phosphoprotein (EAPP)gi|12644154|sp|P20042|IF2B_(—) HUMAN Eukaryotic translation initiationfactor 2 subunit 2 Metastasis adhesion protein (Metadherin) Programmedcell death protein 11 (T cell-specific transcrip- tion factor 1-alpha)(TCF1- alpha) Methionine aminopeptidase 2 (MetAP 2) (Peptidase M 2)(Iniation factor 2 associated 67 kDa glycoprotein) (p67)(p67eIF2)gi|2822161|gb|AAB97937.1| rab3 effector-like
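The identity, positives, and gap figures in the table are each given as a fraction followed by a percentage, for example "Id = 14/25 (56%)". The following is a minimal sketch, not part of the original disclosure, of how such fragments could be checked for transcription errors. It assumes the printed percentage is the fraction truncated to a whole number, which is consistent with entries such as "Id = 6/11 (54%)" above. The function name check_fractions and the pattern are hypothetical, chosen for illustration only.

```python
import re

# Matches fragments such as "Id = 14/25 (56%)" or "Gaps= 9/25 (36%)".
FRACTION_RE = re.compile(
    r"(?P<label>[A-Za-z]+)\s*=\s*(?P<num>\d+)/(?P<den>\d+)\s*\((?P<pct>\d+)%\)"
)


def check_fractions(text):
    """Return (label, num, den, printed_pct, ok) for each fraction found in text."""
    results = []
    for m in FRACTION_RE.finditer(text):
        num, den, pct = int(m["num"]), int(m["den"]), int(m["pct"])
        # Assumption: the printed percentage is the fraction truncated to an integer.
        ok = den > 0 and pct == (num * 100) // den
        results.append((m["label"], num, den, pct, ok))
    return results


sample = "Id = 14/25 (56%), Positives = 16/25 (64%), Gaps = 9/25 (36%)"
for row in check_fractions(sample):
    print(row)
```

Fragments that were split apart when the table was flattened (a fraction separated from its percentage by text from a neighboring column) do not match the pattern and are skipped rather than reported as errors.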

TABLE 6b
Rank | Clone | Description of the genes that are in clones | Peptide sequences of Mimotopes, in-frame with T7 10B gene | Size of Mimotope peptide | Description of the sequences that Mimotopes mimic | Region of similarity of peptide
 3 10_C gi|19807899| EMKRHIST 37 gi|585487|sp| Score = 27.8 bits (58), Expect = 5.1 3 gb|AC11076 LRWKTCLNQ07325|SCYB9_HUMAN Id = 14/28 (50%), Pos = 17/28 (60%) 9.2|Homo ANMKELLESmall inducible Gaps = 11/28 (39%) sapiens BAC IKVTGKIRY cytokine B9precur- Query 12 clone RP11- NQGL sor (CXCL9)ISTLRWK----TCLN---ANMKELLEIK 32 141B14 from Gamma interferonI+TL  K    TCLN   A++ KEL IK 2, complete induced monokine Sbjct 64sequence (MIG) (27.8) IATL--KNGVQTCLNPDSADVKEL--IK 87 Length =gi|62898822|dbj| Score = 26.1 bits (54), Expect = 9.0 134317 BAD97265.1Id = 9/13 (69%), Pos = 12/13 (92%), Score = 599 Serologically de- Gaps= 1/13 (7%) bits (302) fined colon cancer Query 19 Expect = 2e− antigen33 variant MKELLEIKVTGKI 31 168 Antigen NY-CO-33 M+EL+E KVTGK+ Id= 312/317 (26.1) Sbjct 270 (98%) gi|31077164|sp| MEELVE-KVTGKV 281 Gaps= 0/317 O95757|HS74 L_HUMAN Score = 24.0 bits (49), Expect = 39 (0%)Heat shock 70 kDa Id = 6/6 (100%), Pos = 6/6 (100%), Strand = protein 4LGaps = 0/6 (0%) Plus/Plus gi|21315086|gb| Query 8 AAH30792.1| TLRWKT 13Cyclin-dependent TLRWKT kinase 5, regula- Sbjct 403 tory subunit 1TLRWKT 408 Transient receptor potential cation channel subfamily Vmember 3 Vanilloid receptor-like 3 (VRL-3) Dehydrodolichyl diphosphatesynthase Tyrosyl-tRNA synthetase Putative mitochon- drial outer mem-brane protein import receptor  5 10_G gi|21686938| CINMDSPPK 11gi|57208724|emb| Score = 24.8 bits (51), Expect = 338 8 gb|AC11605 QCCAI42568.1 Id = 6/7 (85%), Pos = 7/7 (100%), 0.3|Homo GNAS complex locusGaps = 0/7 (0%) sapiens BAC (OTTHUMP- Query 2 clone RP11- 00000031737)INMDSPP 8 427F2 from 2, Guanine nucleotide +NMDSPP complete se- bindingprotein, Sbjct 195 quence alpha stimulating VNMDSPP 201 Length =activity polypep- Score = 22.3 bits (45), Expect = 
131 165225 tide 1(24.8) Id = 7/9 (77%), Pos = 7/9 (77%), Score = 343 gi|68565390|sp| Gaps= 2/9 (22%) bits (173) Q14676|MDC1_HUMAN Query 4 Expect = 3e− Mediatorof DNA da- MDSPP--KQ 10 92 mage checkpoint MDSPP  KQ Id = 179/182protein 1 (Nuclear Sbjct 1818 (98%) factor with BRCT MDSPPHQKQ 1826 Gaps= 0/182 domains 1) Score = 22.3 bits (45), Expect = 131 (0%)gi|50400452|sp| Id = 8/12 (66%), Pos = 8/12 (66%), Strand = Q7Z569|BRAPGaps = 2/12 (16%) Plus/Plus HUMAN Query 1 BRCA1-associated CINM--DSPPKQ10 protein (Impedes CIN   DSP KQ mitogenic signal Sbjct 110propagation)(IMP) CINAAPDSPSKQ 121 gi|739072|prf| 12002263A E1A-assocprotein gp130 Retinoblastoma-like protein 2 gi|55957867|emb| CA113220.1E74-like factor 1 (ets domain trans- cription factor) RUN and SH3 domaincontaining protein 2 Ezrin-radixin- moesin binding phosphoprotein 50Impedes mitogenic signal propagation (IMP) Membrane-associated nucleicacid bind- ing protein E1A- associated protein gp130 13 6_C8gi|21263318| GAGWEWV 7 gi|18089035|gb| Score = 23.1 bits (47), Expect= 38 gb|AC10444 AAH20586.1| Id = 5/5 (100%), Pos = 5/5 (100%), 1.2|HomoSERS14 protein Gaps = 0/5 (0%) sapiens (23.1) Query 3 chromosome 3gi|49256613|gb| GWEWV 7 clone RP11- AAH73912.1| GWEWV 901H12, com- ACCN4protein Sbjct 946 plete se- (22.7) GWEWV 950 quence Regenerating isletScore = 22.7 bits (46), Expect = 50 Length = derived protein 3 Id = 5/5(100%), Positives = 5/5 (100%), 177320 alpha precursor Gaps = 0/5 (0%)Score = 262 Pancreatitis assoc- Query 2 bits (132) iated protein 1 AGWEW6 Expect = 4e− (PAP) (21.8) AGWEW 67 ELAM-1 ligand Sbjct 255 Id= 160/171 fucosyltransferase AGWEW 259 (93%) Mitochondrial ribo- Gaps= 1/171 somal protein (0%) bMRP64 Strand = Plus/Minus 14 6_D4gi|15004913| PLCLASLLS 57 gi|57209194|emb| Score = 32.9 bits (70),Expect = 0.19 gb|AC00947 FIVCLFHFR CA141407.1 Id = 8/9 (88%), Positives= 9/9 (100%), 5.5|Homo YLPTILLPP Dedicator of cyto- Gaps = 0/9 (0%)sapiens BAC ILKHKCNDR kinesis 11(32.9) Query 12 clone RP11- MHLTCFGSDOCK11 
protein VCLFHFRYL 20 285F23 from AKALMYSL Cdc42-associatedVCLFHFRY+ 2, complete SNNRC guanine nucleotide Sbjct 1319 sequenceexchange factor VCLFHFRYM 1327 Score = 835 ACG|DOCK11 (32.9) Score= 30.3 bits (64), Expect = 1.7 bits(421) gi|13634012|sp| Id = 14/44(31%), Pos = 20/44 (45%), Expect = 0.0 Q15884|CI61_HUMAN Gaps = 21/44(47%) Id = 424/425 Protein C9orf61 6 (99%) (Protein X123)SPLCLA-------SLLSFIVCLFHFR--------------YLPT Gaps = 0/425 (30.3) 28 (0%)Probable G-protein S +C+A       S+LSF+VC F +R              +LPT Strand =coupled receptor 37 Plus/Minus 113 precursorSRMCMAISICQMLSMLSFVVCAFRYRHMFKRGWPMGTCCLFLPT (G-protein coupled 80receptor PGR23) (29.5) Seven transmemhrane helix receptor (26.9) Tastereceptor type 2 member 7 (T2R7) (26.5) Wingless-type MMTV integrationsite family (26.5) 20 10_B gi|21747795| NSFHN 5 gi|119626686|gb| Score= 20.2 bits (40), Expect = 295 11 gb|AC12486 EAX06281.1| Id = 5/5(100%), Positives = 5/5 (100%), 4.3| Homo hCG21296, isoform Gaps = 0/5(0%) sapiens BAC CRA_c Query 1 clone RP11- gi|119615394|gb| NSFHN 5570J4 from 4, EAW94988.1 NSFHN complete se- ubiquitin specific Sbjct 310quence peptidase 48, iso- NSFRN 314 Length = form CRA_b 166797gi|119584494|gb| Score = 224 EAW64090.1 bits (113) solute carrier Expect= 4e− family 6 56 (neurotransmitter Id = 113/113 transporter, GABA),(100%) member 1, isoform Gaps = 0/113 CRA_a (0%) Strand = Plus/Plus 259_C4 gi|22657585| GITGSRPA 12 gi|119623989|gb| Score = 29.9 bits (63),Expect = 0.68 gb|AC09156 WPTW EAX03584.1| Id = 7/7 (100%), Positives= 7/7 (100%), 4.12|Homo cAMP responsive Gaps = 0/7 (0%) sapiens elementbinding Query 6 chromosome protein-like 1, RPAWPTW 12 11, clone isoformCRA_d RPAWPTW RP11-732A19, gi|14250004|gb| Sbjct 312 completeAAH08394.1| RPAWPTW 318 sequence CREBL1 protein Length =gi|119629047|gb| 211735 EAX08642.1| Score = 317 hCG1816309 bits (160)gi|119625915|gb| Expect = 4e− EAX05510.1| 84 homeodomain-only Id= 193/204 protein, isoform (94%) CRA_i Gaps = 3/204 gi|24286115|gb| 
(1%)AAN46678.1| Strand = hypothetical Plus/Plus protein HGRHSV1gi|21751592|dbj| BAC03997.1| unnamed protein product 28 7_C3gi|11493153| VRLVRTEE 23 gi|55859712|emb| Score = 29.1 bits (61), Expect= 0.89 emb|AL1185 RLELRTRS CA110983.1 Id = 7/7 (100%), Positives = 7/7(100%), 23.18| WNWGLVQ Glucosidase beta 2 Gaps = 0/7 (0%) HSJ1031J8(29.1) Query 15 Human DNA gi|55663314|emb| RSWNWGL 21 sequence fromCAH72550.1|Asp RSWNWGL clone RP5- (abnormal spindle)- Sbjct 215 1031J8like, microcephaly RSWNWGL 221 on chromosome associate Score = 28.6 bits(60), Expect = 1.2 20 gi|55663911|emb| Id = 10/18 (55%), Pos = 14/18(77%), Contains a CAH70187.1|Ino- Gaps = 3/18 (16%) putative sitol1,4,5-tris- Query 1 novel gene, phosphate 3-kinase VRLVRTEERLELRTRSWN 18complete B VRLVRT   +EL T++W+ sequence gi|97536946|sp| Sbjct 226 Length= Q9C0D0|PHAR1_HUMAN VRLVRT---MELLTQNWD 240 155213 Phosphatase and Score= 26.9 bits (56), Expect = 3.9 Score = 1304 actin regulator 1 Id = 10/18(55%), Pos = 12/18 (66%), bits (658) (26.1) Gaps = 6/18 (33%) Expect= 0.0 gi|37779178|gb| Query 3 Id = 707/724 AAO73817.1|LV---RTEERLELRTRSW 17 (97%) HBeAg-binding pro- LV   R+EER   RT+SW Gaps= 2/724 tein RPEL repeat Sbjct 191 (0%) containing 1 LVQGARSEER---RTKSW205 Strand = Neprilysin (Neutral Plus/Plus endopeptidase) (NEP) (26.1)Selectin-like pro- tein Common acute lymphoblastic leu- kemia Agprecursor Ephrin type B re- ceptor 4 precursor (24.4) Tyrosine proteinkinase rcceptor HTK Activated met onco- gene (24) Tpr (translocatedpromoter region to activated MET onco- gene) Serine/threonine proteinphosphatase w/EF-hands-1 Sad1/unc-84-like protein 2 Rab5-interactingprotein 34 9_G1 gi|15668084| VFNCWF 6 gi|41351160|gb| Score = 24.4 bits(50), Expect = 13 2 gb|AC09257 AAH65552.1| Id = 5/6 (83%), Positives= 6/6 (100%), 3.2|Homo IARS Gaps = 0/6 (0%) sapiens BACproteingi|55957316| Query 1 clone RP11- emb|CAI16202.1| VFNCWF 6 107from 2, isoleucine-tRNA VF+CWF complete synthetase Sbjct 413 sequencegi|31874258|emb| VFDCWF 418 Score = 
281 CAD98022.1| Score = 20.6 bits(41), Expect = 188 bits (142) hypothetical Id = 4/5 (80%), Positives= 5/5 (100%), Expect = 4e− protein Gaps = 0/5 (0%) 73 gi|32307152|ref|Query 1 Id = 142/142 NP_000907.2| VFNCW 5 (100%) oxytocin receptor VF+CWGaps = 0/142 Sbjct 184 (0%) VFDCW 188 Strand = Plus/Plus 46 1_D1gi|34194579| HLKKKKKK 16 gi|119608512|gb| Score = 30.8 bits (65), Expect= 0.37 1 gb|BC05296 KRGTGXLR EAW88106.1 Id = 9/9 (100%), Positives = 9/9(100%), 3.21|Homo hCG2040846 Gaps = 0/9 (0%) sapiens Snf2-gi|12802996|gb| Query 1 related CBP AAH01198.1| HLKKKKKKK 9 activatorUnknown (protein HLKKKKKKK protein, mRNA for IMAGE: Sbjct 959 (cDNAclone 3355871) HLKKKKKKK 967 IMAGE: gi|119586549|gb| Score 29.9 bits(63), Expect = 0.67 5785548), EAW66145.1 Id = 9/9 (100%), Positives= 9/9 (100%), complete cds zinc finger homeo- Gaps = 0/9 (0%) Length =box 2, Query 3 4805 isoform CRA_d KKKKKKKRG 11 Score = 63.9gi|1703273|sp| KKKKKKKRG bits(32) P50579|AMPM Sbjct 100 Expect = 8e092_HUMAN KKKKKKKRG 108 ld = 32/32 Methionine amino- Score = 29.5 bits(62), Expect = 0.90 (100%) peptidase Id = 9/10 (90%), Positives = 10/10(100%), Gaps = 0/32 2 (MetAP 2) Gaps = 0/10 (0%) (0%) (Peptidase M 2)Query 2 Strand = (Initiation factor LKKKKKKKRG 11 Plus/Minus2-associated 67 kDa LKKKKKKK+G glycoprotein) (P67) Sbjct 490 (p67eIF2)LKKKKKKKKG 499 gi|687243|gb| AAC63402.1| eIF-2-associated p67 homologgi|49616863|gb| AAT67238.1| ubiquitin specific protease 42 50 6_H8gi|4753288| TNSIFGSLE 11 gi|29421174|dbj| Score = 27.4 bits (57), Expect= 2.8 gb|AC004828. 
SY BAA25472.2 Id = 8/10 (80%), Pos = 9/10 (90%),2|AC004828 KIAA0546 protein Gaps = 0/10 (0%) Homo sapiens (27.4) Query 1clone gi|34098393|sp| TNSIFGSLES 10 DJ0514A23, O95388|WISP TNS+FG LEScomplete se- 1_HUMAN Sbjct 736 quence WNT1 inducible TNSVFGGLES 745Length = signaling pathway Score = 22.7 bits (46), Expect = 98 183249protein 1 precursor Id = 7/10 (70%), Pos = 7/10 (70%), Score = 670(WISP-1) (22.7) Gaps = 0/10 (0%) bits (338), gi|11055594|gb| Query 2Expect = 0.0 AAG28165.1| NSIFGSLESY 11 Id = 351/357 Paraneoplastic NIF  LESY (98%), associated brain- Sbjct 350 Gaps = 0/357 testis-cancerNDIFADLESY 359 (0%) antigen Score = 21.8 bits (44), Expect = 176 Strand= gi|52782735|sp| Id = 6/7 (85%), Positives = 7/7 (100%), Plus/MinusQ7Z628|ARH Gaps = 0/7 (0%) G8_HUMAN Query 4 Neuroepithelial IFGSLES 10cell transforming +FGSLES gene 1 protein Sbjct 240 (p65 Net1 proto-VFGSLES 246 oncogene) (Rho guanine nucle- otide exchange factor 8) SETbinding factor 2 (22.3) Sentrin-specific protease 7 Membrane-associatedtransporter protein Melanoma antigen AIM1 Solute carrier family 2Sentrin-specific protease 7 (SUMO-1 specific protease 2) Nuclear porecomplex pro- tein Nup98-Nup96 precursor 51 9_C1 gi|13277000| PVLSSHKNE18 gi|41150972|ref| Score = 23.1 bits (47), Expect = 52 2 emb|AL1387ARDKGKCH XP_371160.1| Id = 8/17 (47%), Positives = 10/17 (58%), 04.12| PPREDICTED: similar Gaps = 5/17 (29%) Human DNA se- to 60S ribosomalQuery 7 quence from protein L21 KNEARDKG-----KCHP 18 clone RP11-gi|20140092|sp| K EA++KG     KC P 417C20 on Q92552|RT27 Sbjct 116chromosome _HUMAN KKEAKEKGTWVQLKCQP 132 13 Contains Mitochondrial 28SScore = 22.3 bits (45), Expect = 94 the 5′ end of ribosomal protein Id= 7/9 (77%), Positives = 8/9 (88%), gene KIAA1016 S27 (S27mt) Gaps = 0/9(0%) and a CpG is- (MRP-S27) Query 5 land, com- gi|62906888|sp|SHKNEARDK 13 plete se- Q86T65|DAA SHK EAR+K quence M2_HUMAN Sbjct 37Length = Disheveled associ- SHKWEAREK 45 165120 ated activator of Score= 22.3 bits (45), Expect = 129 
Score = 236 morphogenesis 2 Id = 6/6(100%), Positives = 6/6 (100%), bits (119), Adapter-related Gaps = 0/6(0%) Expect = 3e− protein complex 4 Query 8 59 mu 1 subunit Se- NEARDK13 Identities = ven transmembrane NEARDK 121/122 helix receptor Sbjct932 (99%), gi|29164873|gb| NEARDK 937 Gaps = 0/122 AAO65168.1| (0%)sarcoma antigen Strand = NY-SAR-27 Plus/Minus gi|739072|prf| 2002263AE1A-associated protein gp130 gi|397148|emb| CAA52671.1| Rb2|p130 proteingi|6686330|sp| Q08999|RBL2_(—) HUMAN Retinoblastoma-like protein 2; Rb-related p130 protein Proteasome subunit beta type 4 precursor G1 to Sphase transition protein 1 homolog Cytochronie p450 53 9_FD1gi|15055218| PAGISRELV 17 gi|6636|226|pdb| Score = 29.5 bits (62),Expect = 0.89 2 gb|AC06022 DKLAAALE 1YX5|B Id = 9/9 (100%), Positives= 9/9 (100%), 6.39|Homo Chain B, Solution Gaps = 0/9 (0% sapiens 12Structure Of S5a Query 9 BAC RP11- Uim-1UBIQUITIN VDKLAAALE 17 101P14COMPLEX Sbjct 84 (Roswell Park gi|118595723|sp| VDKLAAALE 92 CancerInsti- Q9P212|PLC Score = 24.8 bits (51), Expect = 23 tute HumanE1_HUMAN Id = 7/9 (77%), Positives = 9/9 (100%), BAC Library)1-phosphatidylino- Gaps = 0/9 (0%) complete se- sitol-4,5-bisphos- Query2 quence phate AGISRELVD 10 Length = phosphodiesterase AGIS+EL+D 132604epsilon 1 (Phospho- Sbjct 516 Score = 36.2 lipase C-epsilon-1) AGISKELID524 bits (18) (Phosphoinositide- Score = 22.7 bits (46), Expect = 99Expect = 5.6 specific phospho- Id = 6/8 (75%), Positives = 8/8 (100%),Id = 18/18 lipase C epsilon-1) Gaps = 0/8 (0%) (100%) (Pancreas-enrichedQuery 5 Gaps = 0/18 phospholipase C) SRELVDKL 12 (0%) gi|119570428|gb|SR+LV+KL Strand = EAW50043. 
Sbjct 284 Plus/Plus phospholipase C,SRDLVNKL 291 epsilon 1, isoform CRA_b gi|119620997|gb| EAX00592.1|eukaryotic transla- tion initiation factor 2B, subunit 4 delta, 67kDa,isoform CRA_f gi|90185247|sp| Q86W92|L1PB 1_HUMAN Liprin-beta-1 (Pro-tein tyrosine phos- phatase receptor type f polypeptide- interactingpro- tein-binding pro- tein 1) (PTPRF-interacting protein-bindingprotein 1) (hSGT2) gi|33337571|gb| AAQ13438.1| AF057699_1EIF-2B-delta-like protein 54 10_B gi|6449479| HTQGCLPM 23gi|61214481|sp| Score = 31.2 bits (66), Expect = 0.27 12 gb|AC009510.ACAASDSPA Q81ZE3|PACE Id = 9/10 (90%), Positives = 9/10 (90%), 9| CVVCSH1_HUMAN Gaps = 0/10 (0%) Homo sapiens Protein-associating Query 1412p11-37.2- with the carboxyl- DSPACVVCSH 23 54.4 BAC terminal domain ofDSP CVVCSH RP11- ezrin (Ezrin-bind- Sbjct 438 1110J8 ing protein PACE-1)DSPMCVVCSH 447 (Roswell Park (SCY1-like protein Score = 26.5 bits (55),Expect = 6.7 Cancer In- 3) Id = 11/19 (57%), Positives = 12/19 (63%),stitute Human gi|41688566|sp| Gaps = 3/19 (15%) BAC Library)P60329|KR124 Query 1 complete se- _HUMAN HTQGCLPMACAASDSPACV 19 quenceKeratin-associated H+ GC PMAC    SP CV Length = protein 12-4 Sbjct 6145859 (High sulfur kera- HSSGC-PMACPG--SPCCV 21 Score = 676tin-associated Score = 24.4 bits (50), Expect = 21 bits (341) protein12.4) Id = 8/11 (72%), Positives = 9/11 (81%), Expect = 0.0gi|51468842|ref| Gaps = 1/11 (9%) Id = 358/366 XP_374912.2| Query 6(97%) PREDICTED: X-ray LPMACAA-SDS 15 Gaps = 0/366 radiation re- LPMAC AS+S (0%) sistance associated Sbjct 482 Strand = 1 TGF-beta LPMACPALSES492 Plus/Minus resistance-asso- ciated protein TRAG gi|73920974|sp|Q9Y4E6|WDR 7_HUMAN WD-repeat protein 7 (TGF-beta resis- tance-associatedprotein TRAG) (Rabconacctin-3 beta) 6-phosphofructo- kinase type CSignal recognition particle 19 kDa Gastric mucin Con- tactin associatedprotein like 3 precursor Cell recognition molecule Caspr3gi|57015409|sp| Q8IWT3|PAR C_HUMAN p53-associated parkin-likecytoplasmic protein (UbcH7 
associated protein 1) 57 5_C4 gi|14245766|VQKSGWGL 9 gi|33514905|sp| Score = 21.0 bits (42), Expect = 30dbj|AP00245 A Q9P2R3|ANF Id = 6/8 (75%), Pos = 6/8 (75%), 6.3|HomoY1_HUMAN Gaps = 0/8 (0%) sapiens Ankyrin repeat and Query 1 genomic DNA,FYVE domain protein VQKSGWGL 8 chromosome 1 (21) V KSGW L 11qgi|22507380|ref| Sbjct 285 clone: RP11- NP_683766.1 VDKSGWSL 292 956A8com- G protein-coupled Score = 21.0 bits (42), Expect = 210 plete se-receptor Id = 5/6 (83%), Positives = 6/6 (100%), quence gi|32171177|ref|Gaps = 0/6 (0%) Length = NP_037529.2 Query 1 114109 Over-expressedVQKSGW 6 Score = 961 breast tumor +QKSGW bits(485) protein Sbjct 203Expect = 0.0 Adaptor-related IQKSGW 208 Id = 485/485 protein complex(100%) (21) Gaps = 0/485 mRNA decapping (0%) enzyme 2 DCP2 Strand =decapping enzyme Plus/Plus (Nudix motif 20) (20.2) Superoxide dismutase,mito- chondrial precursor Sterol regulatory element binding proteincleavage activating protein Kallikrein 4 pre- cursor (Prostase) 62 11_Cgi|18450186| KKKKKGV 8 gi|3264861|gb| Score = 23.5 bits (48), Expect= 45 7 gb|AC09316 G AAC78729.1| Id = 7/7 (100%), Positives = 7/7 (100%),8.3|Homo eukaryotic transla- Gaps = 0/7 (0%) sapiens BAC tion initiationQuery 1 clone RP11- factor eIF3, p35 KKKKKGV 7 148M21 from subunitKKKKKGV 7, complete gi|119597676|gb| Sbjct 220 sequence EAW77270.1KKKKKGV 226 Length = eukaryotic transla- Score = 21.0 bits (42), Expect= 262 148689 tion initiation Id = 7/8 (87%), Positives = 7/8 (87%),Score = 58.0 factor 3, subunit 1 Gaps = 0/8 (0%) bits (29) alpha, 35kDa, Query 1 Expect = 1e− isoform KKKKKGVG 8 05 gi|119586121|gb| KKKKKGG Id = 41/45 EAW65717. 
Sbjct 723 (91%) cyclin-dependent KKKKKGGG 730Gaps = 0/45 kinase-like 1 Score = 20.6 bits (41), Expect = 351 (0%)(CDC2-related Id = 6/6 (100%), Positives = 6/6 (100%), Strand = kinase),isoform Gaps = 0/6 (0%) Plus/Minus CRA_d Query 1 gi|18087335|gb| KKKKKG6 AAL58838.1| KKKKKG AF390028_1 Sbjct 249 serine/threonine KKKKKG 254protein kinase kkialre- like 1 gi|119625903|gb| EAX05498.1| signalrecognition particle 72 kDa, isoform CRA_c gi|119618653I|gb| EAW98247.1calcium/calmodulin- dependent protein kinase kinase 2, beta, isoformgi|119609183|gb| EAW88777.1 chromodomain helicase DNA binding protein 4Length = 1911 gi|119589505|gb| EAW69099.1 general transcrip- tion factorIIF, polypeptide 1, 74 kDa, isoform CRA_b 63 2_G3 gi|9801555| LMLPGLSL20 gi|49522560|gb| Score = 26.5 bits (55), Expect = 7.0 emb|AL07933PGTLGVRG AAH73937.1| Id = 16/38 (42%), Positives = 16/38 (42%),5.29|Human SLSK IGKC protein Gaps = 19/38 (50%) DNA sequencegi|18676842|dbj| Query1 from clone BAB85036.1|LML--PG-----------LSLPGTLG------VRGSLS 19 RP1-132F21 unnamed proteinLML  PG           LSLP TLG       R SLS on chromosome product Sbjct11 20,complete gi|23111007|ref| LMLWVPGSSGDVVMTQSPLSLPVTLGQPASISCRSSLS 48sequence NP_076957.3| Score = 26.1 bits (54), Expect = 9.3 Length =hypothetical Id = 8/9 (88%), Positives = 8/9 (88%), 71117 protein Gaps= 0/9 (0%) Score = 196 LOC79018 Query 4 bits (99) gi|73620594|sp|PGLSLPGTL 12 Expect = 4e− Q81VV7|CQ03 PGLSLP TL 48 9_HUMAN Sbjct 52Identities = Uncharacterized PGLSLPATL 60 99/99 (100%) protein C17orf39Score = 25.7 bits (53), Expect = 13 Gaps = 0/99 gi|27469499|gb| Id= 9/12 (75%), Positives = 10/12 (83%), (0%) AAH41829.1| Gaps = 0/12 (0%)Strand = Chromosome 17 Query 1 Plus/Minus open reading LMLPGLSLPGTL 12frame 39 L+L GL LPGTL gi|119623763|gb| Sbjct 4 EAX03358.1| LLLAGLLLPGTL15 corneodesmosin Score = 25.2 bits (52), Expect = 17 gi|45477124|sp| Id= 8/9 (88%), Positives = 8/9 (88%), Q86UW6|N4B Gaps = 0/9 (0%) P2_HUMANQuery 3 NEDD4-binding LPGLSLPGT 11 protein 2 LPGL 
LPGT (BCL-3-bindingSbjct 297 protein) LPGLDLPGT 305 65 10_F gi|18449730| QIMRSG 7gi|113420304|ref| Score = 26.1 bits (54), Expect = 6.7 10 gb|AC09269 VXP_001127830.1| Id = 7/7 (100%), Positives = 7/7 (100%), 4.6|HomoPREDICTED: similar Gaps = 0/7 (0%) sapiens 3q to piwi-like 2 Query 1 BACgi|98990269|gb| QIMRSGV 7 RP11-172A10 ABF60230.1| QIMRSGV (Roswell ParkSARP Sbjct 548 Cancer Insti- gi|116241248|sp| QIMRSGV 554 tute HumanQ8N9B4|AN Score = 21.8 bits (44), Expect = 128 BAC Library) R42_HUMAN Id= 7/8 (87%), Positives = 7/8 (87%), complete se- Ankyrin repeat Gaps= 1/8 (12%) quence domain-containing Query 1 Length = protein 42QIM-RSGV 7 151549 gi|119607118|gb| QIM RSGV Score = 444 EAW86712.1 Sbjct115 bits (224) protein-L- QIMLRSGV 122 Expect = 2e− isoaspartate (D-Score = 20.6 bits (41), Expect = 308 122 aspartate) O- Id = 5/6 (83%),Positives = 6/6 (100%), Id = 224/224 methyltransferase Gaps = 0/6 (0%)(100%) domain containing Query 1 Gaps = 0/224 1, isoform QIMRSG 6 (0%)gi|119580466|gb| QIMR+G Strand = EAW60062.1 Sbjct 21 Plus/Minus MCM5minichromosome QIMRTG 26 maintenance defic- ient 5, cell divi- sioncycle 46 (S. 
cerevisiae), iso- form CRA_c gi|27552808|gb| AAH42923.1|Talin 1 gi|546571|gb| AAB30677.1| Wnt4 product [human, breast celllines, Peptide Partial, 120 aa] 71 9_H4 gi|8748861| SRYW 4gi|122892392|gb| Score = 18.9 bits (37), Expect = 571 gb|AC019181.ABM67263.1 Id = 4/4 (100%), Positives = 4/4 (100%), 4|Homoimmunoglobulin Gaps = 0/4 (0%) sapiens BAC heavy chain Query 1 cloneRP11- variable region SRYW 4 272E3 from gi|119608878|gb| SRYW 2,complete EAW88472.1 Sbjct 995 sequence G protein-coupled SRYW 998 Length= receptor 112, 190998 isoform CRA_b Score = 240 gi|119599986|gb| bits(121) EAW79580.1 Expect = 1e− immunoglobulin 60 superfamily, member Id= 121/121 11, isoform CRA_b (100%) gi|119591313|gb| Gaps = 0/121EAW70907.1 (0%) thyroid hormone Strand = receptor interactor Plus/Minus12, isoform CRA_l gi|119588597|gb| EAW68191.1 F-box protein 3, isoformgi|110735406|ref| NP_573438.2|pro- tein tyrosine phosphatase, receptortypeU 78 2_C1 Homo sapiens TNGSKK 11 gi|62906863|sp| Score = 21.8 bits(44), Expect = 127 1 chromosome 8, EKKLXF P30084|ECHM Identities = 8/10(80%), Positives = 9/10 clone LXXXXK _HUMAN (90%), Gaps = 0/10 (0%)RP11-35G22, Enoyl-CoA Query 1 complete hydratase, mito- TNGSKKEKKL 10sequence chondrial precursor T GSKK+KKL Score = 40.1 (Short chain enoyl-Sbjct 1346 bits (20) CoA hydratase) TRGSKKKKKL 1355 Expect = 0.99 (SCEH)(Enoyl-CoA Score = 21.4 bits (43), Expect = 170 Identities =hydratase 1) Identities = 6/6 (100%), Positives = 6/6 20/20 (100%)gi|51702191|sp| (100%), Gaps = 0/6 (0%) Q9H2Y7|ZF10 Query 5 6_HUMANKKEKKL 10 Zinc finger protein Sbjct 312 106 KKEKXL 317 homolog (Zfp-106)Score = 21.4 bits (43), Expect = 170 gi|57284109|emb| Identities = 9/20(45%), Positives = 9/20 CAI43036.1| (45%) TAF7-Iike RNA Gaps = 11/20(55%) polymerase II, TATA Query 1 box binding TNG-----------SKKEKK 9protein (TBP)- TNG           SKKEKK associated Sbjct 10 factor, 50 kDaTNGQPDQQAAPKAPSKKEKK 29 gi|13603873|gb| Score = 21.0 bits (42), Expect= 229 AAK31974.1| Identities = 6/6 (100%), 
Positives = 6/6TBP-associated (100%), Gaps = 0/6 (0%) factor II Q Query 4 gi|401357|sp|SKKEKK 9 Q02004|VGLM_(—) Sbjct 185 DUGBV SKKEKK 190 M polyproteinprecursor /Contains: Glyco- protein G2; Non- structural protein NS-M;Glycoprotein G1| gi|6671334|eb| AAF23161.1| disabled-2 gi|1706487|sp|P98082|DAB2_(—) HUMAN Disabled homolog 2 (Differentially expressedprotein 2) (DOC-2) gi|1706487|sp| P98082|DAB2_(—) HUMAN Disabled homolog2 (Differentially expressed protein 2) (DOC-2) gi|1110539|gb|AAB19032.1| mitogen-responsive phosphoprotein gi|56202443|emb|CAI20904.1| myeloidVlymphoid or mixed-lineage leukemia (trithoraxhomolog, Drosophila)\; translocated to, 4 gi|23306684|gb| AAN15215.1|multiple myeloma transforming gene 2 82 10_D gi|18693518| GETPELITT 22Similar to RIKEN Score = 24.8 bits (51), Expect = 16 7 gb|AC01591NTSQLNFR cDNA 9130427A09 Id = 7/7 (100%), Positives = 7/7 (100%),1.8|Homo KQIVP gene (24.8) Gaps = 0/7 (0%) sapiens gi|23512334|gb| Query7 chromosome AAH38451.1| ITTNTSQ 13 gi|18693518| Similar to RIKENITTNTSQ gb|AC01591 cDNA 9130427A09 Sbjct 150 1.8|Homo gene ITTNTSQ 156sapiens gi|60593095|ref| Score = 24.0 bits (49), Expect = 29 chromosomeNP_001012660.1| Id = 10/19 (52%), Positives = 12/19 (63%), 17, clonehypothetical Gaps = 6/19 (31%) RPII-1094M14, protein Query 1 completese- LOC196996 GETPELITTNTSQLNF-RK 18 quence gi|31566169|gb|G+TP      TS+LNF RK Length = AAH53647.1| Sbjct 185 181561 HypotheticalGQTPA-----TSELNFLRK 198 Score = 496 protein Score = 24.0 bits (49),Expect = 29 bits (250) LOC84978, isoform 1 Ids = 7/8 (87%), Positives= 7/8 (87%), Expect = 6e− gi|22478057|gb| Gaps = 0/8 (0%) 138AAH36900.1| Query 5 Id = 250/250 ZWILCH protein ELITTNTS 12 (100%)gi|38348727|ref| ELITTN S Gaps = 0/250 NP_071348.3| Sbjct 40 (0%)thyroid adenoma ELITTNNS 47 Strand = associated isoform Plus/Plus 1gi|34493758|gb| AAO46785.1| death receptor in- teracting proteingi|57162636|emb| CA139911.1| solute carrier family 7 (cationic aa trans-porter, y+ system), member 1 
gi|13177667|gb| AAH03618.1| Tara-likeprotein, isoform 1 gi|20455324|sp| Q9H2D6|TAR A_HUMAN TRIO and F-actinbinding protein (Protein Tara) gi|252582|gb| AAB22747.1| IFN-tyk, tyk2 =interferon alpha| beta signaling pathway-related protein tyrosine kinasegi|56405328|sp| P29597|TYK2 _HUMAN Non-receptor tyrosine-protein kinaseTYK2 84 10_E gi|11493153| PRMRWGX 23 gi|119578750|gb| QScore = 29.1 bits(61), Expect = 1.2 12 emb|AL1185 XXVAXWPV EAW58346.1 Id = 7/7 (100%),Positives = 7/7 (100%), 23.18| PSHW glucosidase, beta Gaps = 0/7 (0%)Human DNA se- (bile acid) 2, Query 15 quence from isoform CRA_b RSWNWGL21 clone RP5- gi|14042883|dbj| RSWNWGL 1031J8 BAB55430.1| Sbjct 215 onchromosome unnamed protein RSWHWGL 221 20, complete product Score = 28.6bits (60), Expect = 1.6 sequence gi|119611678|gb| Id = 10/18 (55%),Positives = 14/18 (77%), Length = EAW91272.1 Gaps = 3/18 (16%) 155213asp (abnormal Query 1 Score = 1449 spindle)-like, VRLVRTEERLELRTRSWN 18bits (731) microcephaly VRLVRT   +EL T++W+ Expect = 0.0 associated Sbjct241 Id = 767/777 gi|62906885|sp| VRLVRT---MELLTQNWD 255 (98%)P27987|IP3KB Score = 26.9 bits (56), Expect = 5.1 Gaps = 0/777 _HUMANInosito- Id = 10/18 (55%), Positives = 12/18 (66%), (0%) trisphosphate3- Gaps = 6/18 (33%) Strand = kinase_B Query 3 Plus/Plusgi|21361517|ref| LV---RTEERLELRTRSW 17 NP_056937.2| LV   R+EER   RT+SWSAPK substrate Sbjct 191 protein 1 LVQGARSEER---RTKSW 205gi|54144631|ref| Score = 26.5 bits (55), Expect = 6.9 NP_112210.1| Id= 9/12 (75%), Positives = 9/12 (75%), phosphatase and Gaps = 2/12 (16%)actin regulator 1 Query 9 gi|70888311|gb| RLELRTRSWNWG 20 AAZ13758.1|RLE RTR  NWG leukocyte specific Sbjct 286 transcript 1 RLETRTR--NWG 295gi|37222213|gb| Score = 26.1 bits (54), Expect = 9.3 AAQ89957.1| Id= 7/7 (100%), Positives = 7/7 (100%), selectin-like Gaps = 0/7 (0%)protein Query 7 gi|68655017|emb| EERLELR 13 CAF04067.1 EERLELR SEL-OBprotein Sbjct 449 gi|119601416|gb| EERLELR 455 EAW81010.1 Score = 26.1bits (54), Expect = 9.3 splicing 
factor, Id = 8/11 (72%), Positives= 8/11 (72%), arginine/serine- Gaps = 3/11 (27%) rich 5, isoform Query 7CRA_e EERLELRTRSW 17 EERLE   RSW Sbjct 31 EERLE---RSW 38 86 11_Cgi|15808543| REMRLKNT 40 gi|21756842|dbj| Score = 45.6 bits (100),Expect = 1e−05 3 gb|AC09307 KLQSDKRN BAC04969.1| Id = 17/22 (77%),Positives = 17/22 (77%), 3.2|Homo NFGPGAVV unnamed protein Gaps = 0/22(0%) sapiens HTCNPSTSG product Query 19 chromosome GXVGRITgi|21389158|gb| GPGAVVHTCNPSTSGGXVGRIT 40 19 clone AAM50513.1| GPGAV HCNPST GG  GRIT LLNLR-275E5, hypothetical Sbjct 136 complete se- proteinGPGAVAHACNPSTLGGRGGRIT 157 quence gi|20139105|sp| Score = 44.3 bits(97), Expect = 4e−05 Length = Q99959|PKP2 Id = 16/21 (76%), Positives= 16/21 (76%), 2616 _HUMAN Plakophilin- Gaps=0/21 (0%) Score = 319 2Query 20 bits (161) gi|15929032|gb| PGAVVHTCNPSTSGGXVGRIT 40 Expect= 2e− AAH14974.1| PG VVH CNPST GG  GRIT 84 Yip1 interacting Sbjct 17 Id= 180/189 factor homolog B, PGTVVHACNPSTLGGQGGRIT 37 (95%) isoform 1Score = 43.9 bits (96), Expect = 6e−05 Gaps = 0/189 gb|2088182|dbj| Id= 19/27 (70%), Positives = 19/27 (70%), (0%) BAD92538.1| Gaps = 3/27(11%) Strand = SLC2A11 piotein Query 15 Plus/Minus variantRNNFG-PGAVVHTCNPSTSGGXVGRIT 40 gi|66932986|ref| RN  G PGAV H CNPSTGG  GRIT NP_00101938 Sbjct 468 6.1|filamin- RN--GWPGAVAHACNPSTLGGQGGRIT492 binding LIM protein-1 isoform b gi|3335138|gb| AAC39892.1| RNApolymerase 1 40 kD subunit gi|7416053|dbj| BAA93676.1| survivin-betagi|386941|gb| AAA59814.1| MHC HLA-DR-beta-1 chain gi|114849|sp|P20931|YYY3_(—) HUMAN Very very hypothetical B-cell growth factor(BCGF-12 kDa) gi|48474670|sp| Q9NRR6|1NP5 _HUMAN (Phosphatidyl-inositol-4,5- bispHosphate 5- phosphatase) gi|20455217|sp|O43353|RIPK2_HUMAN Receptor-interact- ing serine/ threonine-proteinkinase 2 88 5_C7 gi|22657585| GITGSRPA 12 gi|119629047|gb| Score = 32.5bits (69), Expect = 0.12 gb|ACO9156 WPTW EAX08642.1| Id = 8/8 (100%),Positives = 8/8 (100%), 4.12|Homo hCG1816309 Gaps = 0/8 (0%) sapienschro- gi|119625915|gb| 
Query 5 mosome 11, EAX05510.1| SRPAWPTW 12 clonehomeodomain-only SRPAWPTW RP11-732A19, protein, isoform Sbjct 18complete se- CRA_i SRPAWPTW 25 quence gi|55961994|emb| Score = 29.9 bits(63), Expect = 0.68 Length = CA118336.1| Id = 7/7 (100%), Positives= 7/7 (100%), 211735 cAMP responsive Gaps = 0/7 (0%) Score = 321 elementbinding Query 6 bits (162) protein-like 1 RPAWPTW 12 Expect = 2e−gi|30046775|gb| RPAWPTW 85 AAH50548.1| Sbjct 312 Id = 194/204 KIF4Aprotein RPAWPTW 318 (95%) gi|119625748|gb| Score = 28.2 bits (59),Expect = 2.2 Gaps = 3/204 EAX05343.1| Id = 7/8 (87%), Positives = 7/8(87%), (1%) kinesin family Gaps = 0/8 (0%) Strand = member 4A, Query 5Plus/Plus isoform CRA_b SRPAWPTW 12 gi|119603999|gb| SRPAW TW EAW83593.1Sbjct 1110 NADH dehydrogenase SRPAWATW 1117 (ubiquinone) 1 alphasubcomplex, 5, 13 kDa, isoform CRA_d 89 11_A gi|15789217| NSSA 4gi|125950975|sp| Score = 14.2 bits (26), Expect = 14482 1 gb|AC08481Q9Y4F3|LK Id = 4/4 (100%), Positives = 4/4 (100%), 9.17|Homo AP_HUMANLimkain-b1 Gaps = 0/4 (0%) sapiens 12p gi|124001558|ref| Query 1 BACRPII- NP_689799.3| NSSA 4 449P1 ubiquitin specific NSSA (Roswell Parkprotease 54 Sbjct 90 Cancer Insti- gi|123242853|emb| NSSA 93 tute HumanCAI16881.2 BAC Library) quiescin Q6-like 1 complete se- gi|120660404|gb|quence AAI30523.1| Length = Receptor tyrosine 46530 kinase-like Score= 418 orphan receptor 2 bits (211) gi|120659834|gb| Expect = 1e−AAI30364.1| 114 BRCC2 Id = 216/218 gi|114432132|gb| (99%) AB174674.1|Gaps = 0/218 breast and ovarian (0%) cancer susceptibil- Strand = ityprotein 2 Plus/Minus truncated variant gi|119628905|gb| EAX08500.1|breast cancer 2, early onset, gi|119625696|gb| EAX05291.1| TAF1 RNApolymerase 11, TATA box bind- ing protein (TBP)- associated factor, 250kDa, isoform CRA_a gi|119626158|gb| EAX05753.1| protein phospha- tase,EF-hand calcium binding domain 2, isoform gi|119625726|gb| EAX05321.1|myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog,Drosophila); trans- located to, 7, isoform 
97 9_C2 gi|11875985| NSNIY 5gi|119615275|gb| Score = 20.2 bits (40), Expect = 295 emb|AL0968EAW94869.1| Id = 5/5 (100%), Positives = 5/5 (100%), 68.15|Human zincfinger, UBR1 Gaps = 0/5 (0%) DNA sequence type 1, isoform Query 1 fromclone CRA_c NSNIY 5 RP3-493D19 on gi|82659109|ref| NSNIY chromosomeNP_065816.2| Sbjct 356 6q14.3-16.1, retinoblastoma- NSNIY 360 completese- associated factor quence 600 Length = gi|19070472|gb| 151761AAL83880.1| Score = 757 AF348492_1 p600 bits (382) gi|33319488|gb|Expect = 0.0 AAQ05647.1| Id = 385/386 AF471472_1 Ig (99%) heavy chainGaps = 0/386 variable region, (0%) VH3 family Strand = gi|30046973|gb|Plus/Minus AAH50539.1| Solute carrier fam- ily 14 (urea trans- porter),member1 (Kidd blood group) 99 6_C1 gi|20198509| LTLRTKTT 12gi|89033971|ref| Score = 24.4 bits (50), Expect = 23 2 gb|AC01345 TVTVXP_931830.1 Id = 7/7 (100%), Pos = 7/7 (100%), 2.0|Homo PREDICTED:Hypothe- Gaps = 0/7 (0%) sapiens tical protein Query 2 chromosome 15XP_931830 TLRTKTT 8 clone RP11- gi|55663026|emb| TLRTKTT 325E5 mapCAH172389.1| Sbjct 134 15q21.1, com- Protocadherin 15 TLRTKTT 140 pletegi|3070500|gb| Score = 23.1 bits (47), Expect = 56 sequence AAH51787.1|Id = 8/9 (88%), Pos = 8/9 (88%), Length = Death effector Gaps = 1/9(11%) 204745 filament-forming Query 2 Score = 831 Ced-4-like TLRT-KTTT 9bits (419) apoptosis protein TLRT KTTT Expect = 0.0 gi|1916615|gb| Sbjct610 Id = 446/452 AAC51239.1| TLRTSKTTT 618 (98%) Ribosomal RNA Score= 22.3 bits (45), Expect = 96 Gaps = 3/452 upstream binding Id = 7/9(77%), Positives = 8/9 (88%), (0%) transcription Gaps = 0/9 (0%) Strand= factor Polycystic Query 4 Plus/Minus kidney and hepatic RTKTTTVTV 12disease 1 CCM2 RT TTT+TV protein PP10187 Sbjct 245 Caspase recruitmentRTTTTTLTV 253 domain protein 7 10 10_B gi|15705226| TRGTKRSW 12gi|30983668|gb| Score = 25.2 bits (52), Expect = 12  0 10 emb|AL3908VHSF AAP41104.1| Identities = 8/11 (72%), Positives = 9/11 76.19| Cohensyndrome 1 (81%), Gaps = 1/11 (9%) Human DNA se- protein 
splice Query 1sequence from variant 2 TRGTKRSWYHS 11 clone RP11- gi|77416860|sp| TR+KR WYHS 394D2 on Q92674|FSHP Sbjct 18 chromosome 9, 1_HUMAN TRTSKR-WYHS27 complete se- Follicle-stimulat- Score = 24.4 bits (50), Expect = 377quence ing hormone primary Identities = 7/8 (87%), Positives = 7/8Length = response protein (87%) 185571 (FSH primary re- Query: 2 Score= 682 sponse protein 1) GTKRSWYH 9 bits (344), gi|256180|gb| GT RSWYHExpect = 0.0 AAB23392.1| Sbjct: 1098 Ide = 365/365 cancer-associatedGTVRSWYH 1105 (100%) retinopathy antigen Score = 22.3 bits (45), Expect= 96 Gaps = 0/365 gi|464600|sp| Identities = 6/8 (75%), Positives = 6/8(0%) P35243|RECO_HUMAN (75%), Strand = Recoverin (Cancer Gaps = 0|8 (0%)Plus/Minus associated retino- Query 2 pathy protein) RGTKRSWY 9 (CARprotein) RG K SWY gi|48428608|sp| Sbjct 712 Q12789|TF3C RGKKWSWY 7191_HUMAN General transcrip- tion factor 3C polypeptide 1 (Transcriptionfactor IIIC-alpha subunit) (TF3C- alpha) (TFIIIC 220 kDa subunit)(TFIIIC220) (TFIIIC box B-binding subunit) gi|18375626|ref| NP_542417.11HLA-B associated transcript-2 isoform a gi|4337110|gb| AAD18O86.1| BAT2gi|55662017|emb| CAH70921.1| v-abl Abelson murine leukemia viraloncogene homolog 2 10 12_C gi|10440580| FFVFYLQS 20 gi|12803135|gb|Score = 27.8 bits (58), Expect = 2.2  1 4 gb|AC01554 RIMTDTKI AAH0237.1|Id = 11/25 (44%), Pos = 14/25 (56%), 2.17|Homo SPLH Nipsnap homolog 1Gaps = 7/25 (28%) sapiens 3 BAC (27.8) Query 3 RP11-38B6 gi|8216987|emb|VFY-------LQSRIMTDTKISPLH 20 (Roswell Park CAB92443.1|V+Y       ++SRIM   KISPL Cancer Insti- Putative tumor Sbjct 260 tuteHuman antigen Sarcoma VYYTVPLVRHMESRIMIPLKISPLQ 284 BAC Library) antigen1 Score = 26.5 bits (55), Expect = 5.3 complete se- gi|24419041|gb| Id= 9 12 (75%), Positives = 10 12 (83%), quence, AAL65133.2| Gaps = 1/12(8%) Length = Ovarian cancer Query 9 15150 related tumor RI-MTDTKISPL 19Score = 252 marker CA125 RI MTDT ISP+ bits (127) Sbjct 229 Expect = 1e−RINMTDTGISPM 240 64 Score = 23.5 bits (48), Expect 
= 41 Id = 137/139 Id= 8/10 (80%), Positives = 8/10 (80%), (98%) Gaps = 1/10 (10%) Gaps= 1/139 Query 10 (0%) IMTDT-KISP 18 Strand = IMTDT KI P Plus/Plus Sbjct9801 IMTDTEKIHP 9810 10 2_C7 none PCEMELESP 16 gi|119622175|gb| Score= 28.6 bits (60), Expect = 1.6  2 HPAXCHL EAX01770.1| Id = 11/16 (68%),Positives = 12/16 (75%), hCG1986848 Gaps = 4/16 (25%) gi|23396460|sp|Query 4 O005121|BCL9_HUMAN MELES-PH--PAXCHL 16 B-cell lymphoma 9 MELESPH  P+ CHL protein (Bcl-9) Sbjct 21 (Legless hoinolog) MELESMPHSVPS-CHL35 gi|119569780|gb| Score = 25.7 bits (53), Expect = 13 EAW49395.1 Id= 8/11 (72%), Positives = 10/11 (90%), G protein-coupled Gaps = 1/11(9%) receptor kinase 5, Query 3 isoform CRA_b EMEGPS-PHPA 12gi|23822155|sp| EMEGP+ P+PA Q9BX26|SYCP2_HUMAN Sbjct 572 Synaptonemalcom- EMEGPNVPNPA 582 plex protein 2 Score = 24.4 bits (50), Expect = 31(SCP-2) Id = 6/9 (66%), Positives = 8/9 (88%), gi|98991767|ref| Gaps= 0/9 (0%) NP_690000.2| Query 8 mitogen-activated SPHPAXCHL 16 proteinkinase SPHP+ CH+ kinase kinase 7 Sbjct 31 interacting SPHPSLCHM 39protein 3 Score = 24.0 bits (49), Expect = 41 gi|37538058|gb| Id = 9/13(69%), Positives = 9/13 (69%), AAQ92939.1| Gaps = 4/13 (30%) NFkBactivating Query 1 protein 1 GC---EM-EGQSP 9 gi|3182968|sp| GC   EMEGQSP O15528|CP27B Sbjct 372 _HUMAN GCLIYEMIEGQSP 384 25-hydroxyvitaminScore = 24.0 bits (49), Expect = 41 D-1 Id = 6/7 (85%), Positives = 7/7(100%), alpha hydroxylase, Gaps = 0/7 0%) mitochondrial pre- Query 4cursor (Cytochrome MELESPH 10 P450 subfamily ME+ESPH XXVIIB polypeptideSbjct 1230 1) (Cytochrome p450 MEIESPH 1236 27B1) (Calcidiol 1-monooxygenase) (25- hydroxyvitamin D(3) 1-alpha-hydroxyl- ase) (VD3 1Ahy- droxylase) (P450C1 alpha) (P450VD1- alpha) gi|119620981|gb|EAX00576.1| keratinocyte asso- ciated protein 3 10 8_B1 gi|15789217|NSSA 4 gi|124375836|gb| Score = 14.2 bits (26), Expect = 14482  3 2gb|AC08481 AAI32680.1| Identities = 4/4 (100%), Positives = 4/49.17|Homo Transmembrane (100%) sapiens 12p channel-like 3 Gaps = 0/4(0%) 
BAC RP11- gi|124001558|ref| Query 1 449P1 NP_689799. NSSA 4(Roswell Park ubiquitin specific NSSA Cancer Insti- protease54 Sbjct1178 tute Human gi|120660404|gb| NSSA 1181 BAC Library) AAI30523.1|Length = Receptor tyrosine 46530 kinase-like orphan Score = 480 receptor2 bits (242) gi|120660344|gb| Expect = 1e− AAI30362.1| 132 BRCC2 Id= 251/255 gi|114432132|gb| (98%) ABI74674.1| Gaps = 0/255 breast andovarian (0%) cancer susceptibil- Strand = ity protein 2 Plus/Minustruncated variant gi|119628905|gb| EAX08500.1| breast cancer 2, earlyonset, iso- form CRA_b gi|119620918|gb| EAX00513.1| restin-like 2,isoform CRA_e gi|119625696|gb| EAX05291.1| TAF1 RNA polymerase II, TATAbox bind- ing protein 10 5_G4 gi|2688799| GWQERRN 11 gi|l19600046|gb|Score = 23.1 bits (47), Expect = 75  5 gb|AC003681. KLTK EAW79640.1 Id= 6/7 (85%), Positives = 7/7 (100%), 1|AC003681 WD repeat domain Gaps= 0/7 (0%) Human PAC 52, isoform CRA_a Query 3 clone RP3-gi|112180716|gb| QERRNKL 9 394A18 from AAH30606.2 +ERRNKL 22q12.1-qter,Coiled-coil domain Sbjct 704 complete se- containing 11 EERRNKL 710quence gi|49176521|gb| Score = 23.1 bits (47), Expect = 75 Length =AAT52215.1| Id = 6/9 (66%), Positives = 8/9 (88%), 88528 cellproliferation- Gaps = 0/9 (0%) Score = 224 inducing protein 53 Query 1bits (113) gi|1196104291|gb| GWQERRNKL 9 Expect = 2e− EAW90023.1GW+ERR+ L 56 hCG57025, isoform Sbjct 196 Id = 119/121 CRA_c GWEERRDIL204 (98%) gi|110611170|ref| Score = 22.7 bits (46), Expect = 100 Gaps= 0/121 NP_620688.2|ADAM Id = 6/6 (100%), Positives = 6/6 (100%), (0%)metallopeptidase Gaps = 0/6 (0%) Strand = with thrombospondin Query 4Plus/Minus type 1 motif, 17 ERRNKL 9 preproprotein ERRNKLgi|74723116|sp| Sbjct 25 Q70EL4|UBP4 ERRNKL 30 3_HUMAN Score 22.7 bits(46), Expect = 100 Ubiquitin carboxyl- Id = 7/11 (63%), Positives = 9/11(81%), terminal hydrolase Gaps = 2/11 (18%) 43 (Ubiquitin Query 2thioesterase 43) WQERRN--KLT 10 gi|119582112|gb| W+ERRN  +LT EAW61708.Sbjct 219 dynactin 4 (p62), WRERRRAIRLT 229 
gi|119581530|gb| EAW61126.1polymerase (DNA directed), mu, isoform CRA_f gi|119614601|gb| EAW94195.1SMAD specific E3 ubiquitin protein ligase gi|12018151|gb| AAG45422.1| E3ubiquitin ligase SMURF2 gi|81022914|gb| ABB55266.1| rhabdomyosarcomaantigen MU-RMS-40.8 10 1_H4 gi|12001748| VEGKVARC 23 gi|51491211|emb|Score = 25.2 bits (52), Expect = 17  7 emb|AL3550 QSKSPGFEDCAH18671.1|hypo- Id = 7/10 (70%), Positives = 9/10 (90%), 52.3|CNS05TC4GLFGKF thetical Gaps = 0/10 (0%) Human chromo- proteingi| Query 8 some14 DNA 119598885|gb|EAW CQSRSPGFED 17 sequence BAC 78479.1|hCG2040711CQSKSPG ++ R-1070A8 of gi|119605408|gb| Sbjct 409 library RPC1-EAW85002.1 CQSKSPGLDN 418 11 from chro- coagulation factor Score = 24.8bits (51), Expect = 22 mosome 14 of XII Id = 9/16 (56%), Positives= 9/16 (56%), Homo sapiens gi|116242624|sp| Gaps = 5/16 (31%) (Human),com- Q12852|M3K Query 4 plete se- 12_HUMAN KVARCQSKSPGFEDGL 19 quenceMitogen-activated KV RC     GFED L Length = protein kinase Sbjct 69179712 kinase kinase 12 KVGRC-----GFEDNL 79 Score = 549 (Mixed lineageScore = 24.0 bits (49), Expect = 40 bits (277) kinase) Id = 7/9 (77%),Positives = 7/9 (77%), Expect = 6e− gi|561543|gb| Gaps = 0/9 (0%) 154AAA67343.1| Query 5 Id = 277/277 serine/threonine VARCQSKSP 13 (100%)protein kinase VARCQ K P Gaps = 0/277 gi|4033480|sp| Sbjct 46 (0%)Q13595|TRA2A VARCQCKGP 54 Strand = _HUMAN Plus/Minus Transformer-2protein homolog (TRA-2 alpha) 11 10_B LTSKTQ 8 gi|17368711|sp| Score= 23.1 bits (47), Expect = 70  0 RK Q9BZE4|NOG Identities = 8/9 (88%),Positives = 8/9 1_HUMAN (88%), Gaps = 1/9 (11%) Nucleolar GTP- Query 7binding LT-SKTQRK 14 protein 1 (Chronic LT SKTQRK renal failure geneSbjct 21 protein) (GTP- LTLSKTQRK 29 binding protein Score = 23.1 bits(47), Expect = 43 NGB) Id = 8/9 (88%), Positives = 8/9 (88%),gi|3153873|gb| Gaps = 1/9 (11%) AAC24364.1| Query 1 putative G-bindingLT-SKTQRK 8 protein LT SKTQRK gi|27923746|sp| Sbjct 5 Q96P48|CENLTLSKTQRK 13 D2_HUMAN Score = 23.1 bits (47), Expect = 43 
Centaurindelta 2 Id = 8/9 (88%), Positives = 8/9 (88%), (Cnt-d2) Gaps = 1/9 (11%)gi|51464006|ref| Query 1 XP_497909.1| LT-SKTQRK 8 similar to dual LTSKTQRK specificity phos- Sbjct 19 phatase 5; VH1-like LTLSKTQRK 27phosphatase 3; Score = 21.4 bits (43), Expect = 228 serine/threonineIdentities = 6/6 (100%), Positives = 6/6 specific protein (100%)phosphatase Gaps = 0/6 (0%) gi|62089088|dbj| Query 8 BAD92988.1| TSKTQR13 Rho GTPase activat- TSKTQR ing protein 5 Sbjct 684 variant TSKTQR 689gi|55859653|emb| Score = 21.4 bits (43), Expect = 140 CA110958.1| Id= 6/6 (100%), Positives = 6/6 (100%), nuclear receptor Gaps = 0/6 (0%)subfamily 5, group Query 2 A, member 1 TSKTQR 7 gi|339550|gb| TSKTQRAAA50404.1| Sbjct 119 transforming growth TSKTQR 124 factor-beta-2 Score= 19.7 bits (39), Expect = 452 precursor Id = 6/7 (85%), Positives = 7/7(100%), gi|62896709|dbj| Gaps = 0/7 (0%) BAD96295.1| Query 2 TATAbinding TSKTQRK 8 protein interacting TSKT+RK protein 49 kDa Sbjct 820variant TSKTKRK 826 Score = 18.9 bits (37), Expect = 814 Id = 5/5(100%), Positives = 5/5 (100%), Gaps = 0/5 (0%) Query 4 KTQRK 8 KTQRKSbjct 59 KTQRK 63 11 10_C gi|55734203| GDPNSS 6 gi|119588281|gb| Score= 18.5 bits (36), Expect = 1148  3 1 emb|CR8614 EAW67875.1 Id = 5/5(100%), Positives = 5/5 (100%), 77.1|Human protein tyrosine Gaps = 0/5(0%) DNA sequence phosphatase, Query 1 from clone receptor typeJ GDPNS 5XX-NCIH2171_(—) gi|11961566-|gb| GDPNS 4G20, com- EAW95254.1 Sbjct 177plete cytoplasmic linker GDPNS 181 sequence associated protein Score= 18.5 bits (36), Expect = 1148 Length = 1, isoform Id = 5/5 (100%),Positives = 5/5 (100%), 121004 gi|119607396|gb| Gaps = 0/5 (0%) Score= 577 EAW86990.1 Query 2 bits (291) telomeric repeat DPNSS 6 Expect= 2e− binding factor DPNSS 162 (NIMA-interacting) Sbjct 2031 Id= 315/315 1 DPNSS 2035 (100%) gi|119613827|gb| Gaps = 0/315 EAW93421.1(0%) TNF reccptor- Strand = associated factor Plus/Plus 5, isoform CRA_agi|119597101|gb| EAW76695.1 transformation| 
transcriptiondomain-associated protein, gi|119572538|gb| EAW52153.1 protease, serine,36, isoform gi|116242829|sp| Q9Y4A5|TR RAP_HUMAN Transformation|transcription do- main-associated protein (350/400 kDa PCAF- associatedfactor) gi|119597105|gb| EAW76699.1 transformation| transcription domain-associated protein, isoform CRA_g 11 9_C6 gi|18873821| GDPNSVY 7gi|62898359|dbj| Score = 21.0 bits (42), Expect = 228  6 gb|AC01027BAD97119.1| Id = 6/7 (85%), Positives = 6/7 (85%), 9.5|Homo guaninemonophos- Gaps = 0/7 (0%) sapiens phate synthetase Query = 1 chromosome5 variant GDPNSVY 7 clone CTC- G PNSVY 533D18, com- Sbjct 171 plete se-GGPNSVY 177 quence Length = 125746 Score = 109 bits (55) Expect = 1e− 21Id = 63/67 (94%) Gaps = 0/67 (0%) Strand = Plus/Minus 11 6_G3gi|244d18207| YVYRLPIRS 26 gi|4758412|ref| Score = 29.1 bits (61),Expect = 1.2  9 gb|AC08739 LTGGAGGG NP_004472.1| Id = 11/15 (73%),Positives = 12/15 (80%), 2.10|Homo GRQEAWM Polypeptide N- Gaps = 1/15(6%) sapiens GT acetylgalactosa- Query 10 chromosome minyltransferase 2LTGGAGGG-CRQEAW 23 17, clone Growth/differentia- L GGAGGG GR+E WRP11-676J12, tion factor-11 Ras Sbjct 31 complete se- and Rab interactorLAGGAGGGAGRKEDW 45 quence 3 Length = 196772 Score = 1051 bits (530)Expect = 0.0 Id = 534/536 (99%) Gaps = 0/536 (0%) Strand = Plus/Minus 1212_C gi|22002645| VWAA 4 gi|31419318|gb| Score = 16.8 bits (32), Expect= 1771  0 12 emb|AL1588 AAH52995.1| Identities = 4/4 (100%), Positives= 4/4 27.27|Human SAPS2 protein (100%), DNA sequence gi|32810412|gb|Gaps = 0/4 (0%) from clone AAO65542.1| Query 1 RP11-330M2 onUDP-glucuronosyl- VWAA 4 chromosome 9, transferase 2B7 VWAA complete se-gi|13699834|ref| Sbjct 684 quence NP_085080.1| VWAA 687 Length =matrilin 4 186108 isoform 2 Score = 531 gi|55962492|emb| bits (268)CAI17303.1| Expect = 1e− receptor tyrosine 148 kinase-like orphan Id= 272/274 receptor 2 (99%) gi|57242755|ref| Gaps = 0/274 NP_055759.3|(0%) calsyntenin 1 Strand = isoform 2 Plus/Plus gi|45767643|gb|AAH67427.1| 
Cytochrome P450, family 1, sub- family A, poly- peptide 2ATP- binding cassette Solute carrier family 26, member 11 12 12_Dgi|37694066| CTNGILLK 10 gi|29568109|ref| Score = 24.4 bits (50), Expect= 22  3 2 ref|NM_1979 KI NP_055520.2| Id = 7/8 (87%), Positives = 7/8(87%), 55.1|Homo dedicator of Gaps = 0/8 (0%) sapiens cytokinesis 4Query 3 chromosome 15 gi|29335973|gb| NGILLKKI 10 open readingAAO73565.1| N ILLKKI frame 48 DOCK4 Sbjct 1132 (C15orf48),gi|18089123|gb| NSILLKKI 1139 transcript AAH20733.1| Score = 23.1 bits(47), Expect = 54 variant 1, Sushi-repeat- Id = 6/7 (85%), Positives= 7/7 (100%), mRNA containing protein, Gaps = 0/7 (0%) Length = X-linked2 Query 1 815 gi|22001626|sp| CTNGILL 7 Score = 214 Q9UKD1|GM CTNG+LLbits (108) EB2_HUMAN Sbjct 135 Expect = 1e− Glucocorticoid mod- CTNGVLL141 53 ulatory element- Score = 22.3 bits (45), Expect = 98 Id = 115/118binding protein 2 Id = 6/8 (75%), Positives = 8/8 (100%), (97%) (GMEB-2)Gaps = 0/8 (0%) Gaps = 0/118 (Parvovirus Query 3 (0%) initiation factorNGILLKKI 10 Strand = p79)(PIF p79) NGI+L+KI Plus/Minus (DNA-bindingSbjct 149 protein p79PIF) NGIMLRKI 156 516.1| Score = 21.8 bits (44),Expect = 131 nucleic acid hel- Id = 6/9 (66%), Positives = 8/9 (88%),icase DDXx Gaps = 0/9 (0%) gi|62897301|dbj| Query 1 BAD96591.1|CTNGILLKK 9 Fanconi anemia, CT G+LL+K complemeatation Sbjct 678 group Cvariant CTTGVLLRK 686 gi|16506130|dbj| BAB70696.1| phosphatidylino-sitol 3-kinase- related protein kinase gi|62087846|dbj| BAD92370.1|ataxia telangi- ectasia mutated protein isoform 1 variantgi|32483361|ref| NP_863658.1| apoptotic protease activating factorisoform d 12 8_G8 gi|11544447| HSHISNRKT 30 gi|31874131|emb| Score= 28.6 bits (60), Expect = 1.1  6 emb|AL1393 TNGYLEVA CAD97974.1| Id= 8/10 (80%), Pos = 10/10 (100%), 49.36|Human PTWKGKAG hypothetical Gaps= 0/10 (0%) DNA sequence QGFGH protein Query 20 from clonegi|15214067|sp| WKGKAGQGFG 29 RP11-261P9 O95405|ZFYV W+G+AGQGFG onchromosome 9_HUMAN Sbjct 5 20 (Receptor activa- WRGRAGQGFG 
14 Length =tion anchor) core = 26.5 bits (55), Expect = 4.8 68754 Serine protease-Id = 12/19 (63%), Pos = 12/19 (63%), Score = 1035 like protein Smad Gaps= 4/19 (21%) bits (522) anchor for receptor Query 4 Expect = 0.0activation ISNRK--TTNGYLEVAPTW 20 Id = 623/647 gi|55741557|ref| ISRK  TT G  EVAP W (96%) NP_055809.1 Sbjct 578 Gaps = 10/647Mitogen-activated ISARKPFTTLG--EVAPVW 594 (1%) protein kinase Score= 23.5 bits (48), Expect = 38 Strand = binding protein 1 Id = 9/16(56%), Pos = 12/16 (75%), Plus/Minus gi|182775|gb| Gaps = 2/16 (12%)AAA58487.1| Query 7 v-fos transforma- RKTTNGY-LEVAPTWK 21 tion effectorRKTT  Y ++V P+WK protein Sbjct 609 Mothers against RKTTL-YDMDVEPSWK 623decapentaplegic homolog interacting protein (Madh- interacting pro-tein) Novel serine protease Adapter related protein complex 3 beta 1subunit Dickkopf- like protein 1 precursor 12 9_H7 gi|23268261| YRLMEEN7 gi|119592213|gb| Score = 24.4 bits (50), Expect = 22  7 gb|AC12978EAW71807.1 Id = 6/6 (100%), Positives = 6/6 (100%), 2.3|Homo zonapellucida Gaps = 0/6 (0%) sapiens BAC glycoprotein 3 Query 2 cloneRP111- (sperm receptor), RLMEEN 7 28O7 from isoform CRA_a RLMEEN UL,complete gi|113415975|ref| Sbjct 124 sequence XP_0011291 RLMEEN 129Length = 78.1|PREDICTED: Score = 24.4 bits (50), Expect = 22 66860similar to multiple Id = 7/8 (87%), Positives = 7/8 (87%), Score = 172coiled-coil Gaps = 1/8 (12%) bits (87) GABABR1-binding Query 1 Expect= 3e− protein YRL-MEEN 7 41 gi|28436730|gb| YRL MEEN Id = 92/94AAH47075.1| Sbjct 564 (97%) Janus kinase and YRLEMEEN 571 Gaps = 0/94microtubule inter- Score = 18.5 bits (36), Expect = 1331 (0%) actingprotein 1 Id = 4/5 (80%), Positives = 5/5 (100%), Strand =gi|38641276|gb| Gaps = 0/5 (0%) Plus/Plus AAR26235.1| Query 2 MARLIN1RLMEE 6 gi|119602807|gb| RLM+E EAW82401.1 Sbjct 207 janus kinase andRLMDE 211 microtubule inter- Score = 24.4 bits (50), Expect = 22 actingprotein 1, Id = 7/8 (87%), Positives = 7/8 (87%), isoform CRA_a Gaps= 1/8 (12%) gi|5817226|emb| Query 1 
CAB53703.1| YRL-MEEN 7 hypotheticalYRL MEEN protein Sbjct 571 gi|11995070|dbj| YRLEMEEN 578 BAB20049.1|Score = 18.5 bits (36), Expect = 1331 calmodulin- Id = 4/5 (80%),Positives = 5/5 (100%), dependent Gaps = 0/5 (0%) phosphodiesteraseQuery 2 gi|16151615|emb| RLMEE 6 CAC82208.1 RLM+E 3′5′-cyclic Sbjct 207nucleotide RLMDE 211 phosphodiesterase 1A5 gi|119631376|gb| EAX10971.|1phosphodiesterase 1A, calmodulin- dependent, isoform CRA_ggi|1705942|sp| P54750|PDE1A _HUMAN Calcium/calmodu- lin-dependent 3′,5′-cyclic nucleo- tide phosphodi- esterase 1A gi|119604572|gb|EAW84166.1 SW1/SNF related, matrix associated, actin dependent regulatorof chromatin, sub- family a, member 4 gi|4056413|gb| AAC97987.1|SN24_HUMAN; nuclear protein GRB1; home- otic gene regula- tor; SNF2-BETAgi|738309|prf|| 1924378A nucler protein GRB1 gi|505088|dbj| BAA05143.1|transcriptional activator hSNF2b gi|109734817|eb| AAI17695.1| DOCK4protein gi|40254834|ref| NP_006603.2| kinesin family member 1Cgi|116242606|sp| O43896|KIF1 C_HUMAN Kinesin-like pro- tein KIF1Cgi|109734809|gb| AA117689.1| Dedicator of cytokinesis 4 12 9_D1gi|20800377| KSFKVNI 13 gi|113415614|ref| Score = 26.1 bits (54), Expect= 9.5  8 1 gb|AC11661 SLMFCK XP_0011282 Id = 9/20(45%), Positives= 10/20 (50%), 8.4|Homo 50.1|PREDICTED: Query 2 sapiens BAC hypotheticalSFKYNISL--------MFCK 13 clone RP11- protein S+ Y ISL        MFCK 98L17from gi|119611411|gb| Sbjct 7 4, complete EAW91005. 
SYMYQISLQQAFCTVIMFCK26 sequence astrotactin, Score = 25.7 bits (53), Expect = 13 Length =isoform CRA_a Id = 7/8 (87%), Positives = 7/8 (87%), 153040gi|119602676|gb| Gaps = 1/8 (12%) Score = 220 EAW82270.1 Query 5 bits(111) hCG1995022 YNISLMFC 12 Expect = 2e− gi|20270245|ref| YNI LMFC 55NP_612474.1| Sbjct 686 Id = 113/114 GLI-Kruppel family YNI-LMFC 692(99%) member GLI4 Score = 24.8 bits (51), Expect = 23 Gaps = 0/114gi|33302619|sp| Id = 8/10 (80%), Positives = 9/10 (90%) (0%) P10075|GLI4Gaps = 1/10 (10%) Strand = HUMAN Query 1 Plus/Minus Zinc fingerKSFKYNISLM 10 protein GLI4 KSFKYN SL+ (krueppel-related Sbjct 190 zincfinger protein KSFKYN-SLL 198 4) (Protein HKR4) gi|119576919|gb|EAW56515.1 CTTNBP2 N-terminal like, gi|119570340|gb| EAW49955.1 RhoGTPase activa- ting protein 19, isoform CRA_d 12 6_C3 none GISTLK 6gi|119574273|gb| Score = 20.6 bits (41), Expect = 426  9 EAW53888.1 Id= 6/6 (100%), Positives = 6/6 (100%), hCG2040542 Gaps = 0/6 (0%)gi|119628905|gb| Query 1 EAX08500.1| GISTLK 6 breast cancer 2, GISTLKearly onset, Sbjct 170 isoform CRA_b GISTLK 175 gi|55957538|emb| Score= 19.3 bits (38), Expect = 1030 CA113195.1| Id = 6/10 (60%), Positives= 6/10 (60%), BRCA2 Gaps = 0/10 (0%) gi|14424438|sp| Query 2 P51587|BRCAISTLKXXXXK 11 2_HUMAN ISTLK    K Breast cancer type Sbjct 580 2susceptibility ISTLKKKTNK 589 protein (Fanconi Score = 18.9 bits (37),Expect = 1381 anemia group D1 Id = 6/10 (60%), Positives = 6/10 (60%),protein) Gaps = 0/10 (0%) gi|42793995|gb| Query 3 AAH66592.1| STLKXXXXKL12 CUTL1 protein STLK    KL gi|119570613|gb| Sbjct 336 EAW50228.1STLKQLEEKL 345 cut-like 1, CCAAT displacement pro- tein (Drosophila),isoform gi|44887461|gb| AAS48058.1| T cell antigen re- ceptor beta chaingi|1552504|gb| AAC80198.1| V segment transla- tion product 13 10_Dgi|15451718| GDPNS 5 gi|119615660|gb| Score = 18.5 bits (36), Expect= 957  0 3 gb|AC02297 EAW95254.1 Id = 5/5 (100%), Positives = 5/5(100%), 3.5|Homo cytoplasmic linker Gaps = 0/5 (0%) sapiens, associatedprotein 
Query 1 clone 1, isoform CRA_c GDPNS 5 RP11-473O4,gi|119607399|gb| GDPNS complete se- EAW86993.1 Sbjct 216 quencetelomeric repeat GDPNS 220 Length = binding factor 194367(NIMA-interacting) Score = 180 1, isoform CRA_a bits (91)gi|119612111|gb| Expect = 2e− EAW91705.1 43 protein phosphatase Id= 91/91 2C, magnesium- (100%) dependent, cata- Gaps = 0/91 lyticsubunit, (0%) isoform Strand = gi|119588281|gb| Plus/Minus EAW67875.1protein tyrosine phosphatase, re- ceptor type, J, isoform CRA_cgi|11342O339|ref |XP_944412. 2|PREDICTED: similar to growth inhibitionand differentiation related protein 86  3 10_C gi|19807899| EMKRHIST 37gi|585487|sp| Score = 27.8 bits (58), Expect = 5.1 3 gb|AC11076 LRWKTCLNQ07325|SCYB9_(—) Id = 14/28 (50%), Pos = 17/28 (60%), 9.2|Homo ANMKELLEHUMAN Gaps = 11/28 (39%) sapiens BAC IKVTGKIRY Small inducible Query 12clone RPII- NQGL cytokine B9 pre- ISTLRWK----TCLN---ANMKELLEIK 32 141B14cursor (CXCL9) I+TL  K    TCLN   A++ KEL IK from 2, com- Gammainterferon Sbjct 64 plete se- induced monokineIATL--KNGVQTCLNPDSADVKEL--IK 87 quence (MIG) (27.8) Score = 26.1 bits(54), Expect = 9.0 Length = gi|62898822|dbj| Id = 9/13 (69%), Pos= 12/13 (92%), 135317 BAD97265.1 Gaps = 1/13 (7%) Score = 599Serologically de- Query 19 bits (302) fined colon cancer MKELLEIKVTGKI31 Expect = 2e− antigen 33 variant M+EL+E KVTGK+ 168 Antigen NY-CO-33Sbjct 270 Id = 312/317 (26.1) MEELVE-KVTGKV 281 (98%) gi|31077164|sp|Score = 24.0 bits (49), Expect = 39 Gaps = 0/317 O95757/HS74 Id = 6/6(100%), Pos = 6/6 (100%), (0%) L_HUMAN Gaps = 0/6 (0%) Strand = Heatshock 70 kDa Query 8 Plus/Plus protein 4L TLRWKT 13 gi|21315086|gb|TLRWKT AAH30792.1| Sbjct 403 Cyclin-dependent TLRWKT 408 kinase 5,regulatory subunit 1 Transient receptor potential cation channelsubfamily V member 3 Vanilloid receptor-like 3 (VRL-3) Dehydrodolichyldiphosphatc syn- thase Tyrosyl-tRNA synthetase Putative mitochon- drialouter mem- brane protein im- port receptor  5 10_G gi|21686938|CINMDSPPK 11 gi|57208724|emb| 
Score = 24.8 bits (51), Expect = 338 8gb|AC11605 QC CA142568.1 Id = 6/7 (85%), Pos = 7/7 (100%), 0.3|Homo GNAScomplex locus Gaps = 0/7 (0%) sapiens BAC (OTTHUMP0000003173 Query 2clone RP11- 7) INMDSPP 8 427F2 from Guanine nucleotide +NMDSPP 2,complete binding protein, Sbjct 195 sequence alpha stimulating VNMDSPP201 Length = activity polypep- Score = 22.3 bits (45), Expect = 131165225 tide 1 Id = 7/9 (77%), Pos = 7/9 (77%), Score = 343gi|68565390|sp| Gaps = 2/9 (22%) bits (173) Q14676|MDC1_HUMAN Query 4Expect = 3e− Mediator of DNA MDSPP--KQ 10 92 damage checkpoint MDSPP  KQId = 179/182 protein 1 (Nuclear Sbjct 1818 (98%) factor with BRCTMDSPPHQKQ 1826 Gaps = 0/182 domains 1) Score 22.3 bits (45), Expect= 131 (0%) gi|50400452|sp| Id = 8/12 (66%), Pos = 8/12 (66%), Strand =Q7Z569/BRAP Gaps = 2/12 (16%) Plus/Plus _HUMANBRCA1-associ- Query 1 atedprotein CINM--DSPPKQ 10 (Impedes mitogenic CIN   DSP KQ signalpropagation) Sbjct 110 (IMP) CINAAPDSPSKQ 121 gi|739072|prf|| 2002263AE1A-assoc protein gp130 Retinoblastoma-like protein 2 gi|55957867|emb|CA113220.1 E74-like factor 1 (ets domain trans- cription factor) RUN andSH3 domain containing protein 2 Ezrin-radixin- moesin bindingphosphoprotein 50 Impedes mitogenic signal propagation (IMP)Membrane-associated nucleic acid bind- ing protein E1A-associatedprotein gp130 13 6_C8 gi|21263318| GAGWEWV 7 gi|18089035|gb| Score= 23.1 bits (47), Expect = 38 gb|AC10444 AAH20586.1| Id = 5/5 (100%),Pos = 5/5 (100%), 1.2|Homo SERS14 protein Gaps = 0/5 (0%) sapiensgi|49256613|gb| Query 3 chromosome 3 AAH73912.1| GWEWV 7 clone RP11-ACCN4 protein GWENV 901H12, Regenerating islet Sbjct 946 complete se-derived protein 3 GWEWV 950 quence alpha precursor Score = 22.7 bits(46), Expect = 50 Length = Pancreatitis Id = 5/5 (100%), Positives = 5/5(100%), 2177320 associated protein Gaps = 0/5 (0%) Score = 262 1 (PAP)(21.8) Query 2 bits (132) ELAM-1 ligand AGWEW 6 Expect = 4e−fucosyltransferase AGWEW 67 Mitochondrial ri- Sbjct 255 Id = 160/171bosomal 
protein AGWEW 259 (93%) bMRP64 Gaps = 1/171 (0%) Strand =Plus/Minus 14 6_D4 gi|15004913| PLCLASLLS 57 gi|57209194|emb| Score= 32.9 bits (70), Expect = 0.19 gb|AC00947 FIVCLFHFR CAI41407.1 Id = 8/9(88%), Positives = 9/9 (100%), 5.5|Homo YLPTILLPP Dedicator of Gaps= 0/9 (0%) sapiens BAC I cytokinesis 11 Query 12 clone RP11- LKHKCNDRDOCK11 protein VCLFHFRYL 20 285F23 from MHLTCFGS Cdc42-associatedVCLFHFRY+ 2, complete AKALMYSL guanine nucleotide Sbjct 1319 sequenceSNNRC exchange factor VCLFHFRYM 1327 Score = 835 ACG|DOCK11 Score = 30.3bits (64), Expect = 1.7 bits(421) gi|13634012|sp| Id = 14/44 (31%), Pos= 20/44 (45%), Expect = 0.0 Q15884|C161_(—) Gaps = 21/44 (47%) Id= 424/425 HUMAN 6 (99%) Protein C9orf61SPLCLA-------SLLSFIVCLFHFR--------------YLPT Gaps = 0/425 (Protein X123)28 (0%) Probable G-protein S+ C+A       S+LSF+VC F +R              +LPTStrand = coupled receptor 37 Plus/Minus 113 precursorSRMCMAISICQMLSMLSFVVCAFRYRHMFKRGWPMGTCCLFLPT (G-protein coupled 80receptor PGR23) Wingless-type MMTV integration site family 20 10_Bgi|21747795| NSFHN 5 gi|119626686|gb| Score = 20.2 bits (40), Expect= 295 11 gb|AC12486 EAX06281.1| Id = 5/5 (100%), Positives = 5/5 (100%),4.3|Homo hCG21296, isoform Gaps = 0/5 (0%) sapiens BAC CRA_c Query 1clone RPd11- gi|119615394|gb| NSFHN 5 570J4 from 4, EAW94988.1 NSFHNcomplete se- ubiqititin specific Sbjct 310 quence peptidase NSFHN 314Length = 48, isoform CRA_b 166797 gi|119584494|gb| Score = EAW64090.1224 bits solute carrier (113) family 6 Expect = 4e− (neurotransmitter 56transporter, GABA), Id = 113/113 member 1, isoform (100%) CRA_a Gaps= 0/113 (0%) Strand = Plus/Plus 25 9_C4 gi|22657585| GITGSRPA 12gi|119623989|gb| Score 29.9 bits (63), Expect = 0.68 gb|AC09156 WPTWEAX03584.1| Id = 7/7 (100%), Positives = 7/7 (100%), 4.12|Homo cAMPresponsive Gaps = 0/7 (0%) sapiens element binding Query 6 chromosomeprotein-like 1 RPAWPTW 12 11, clone gi|14250004|gb| RPAWPTW RP11-732A19,AAH08394.1| Sbjct 312 complete se- CREBL1 protein RPAWPTW 318 
quencegi|119625915|gb| Length = EAX05510.1| 211735 homeodomain-only Score= 317 protein bits (160) gi|24286115|gb| Expect = 4e− AAN46678.1|RPAWPTW 84 hyypothetical Id = 193/204 protein (94%) HGRHSVI Gaps = 3/204(1%) Strand = Plus/Plus

REFERENCES

-   1. Alizadeh A A, et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403:503-511 (2000).
-   2. An A, et al. A learning system for more accurate classifications. Lecture Notes in Artificial Intelligence, Vancouver. 1418:426-441 (1998).
-   3. Aunoble B, et al. Major oncogenes and tumor suppressor genes involved in epithelial ovarian cancer. Int J Oncol 16:567-76 (2000).
-   4. Baron A T, et al. Serum sErbB1 and Epidermal Growth Factor Levels As Tumor Biomarkers in Women with Stage III or IV Epithelial Ovarian Cancer. Cancer Epidemiology, Biomarkers & Prevention 8:129-137 (1999).
-   5. Bauer R, et al. Cloning and characterization of the Drosophila homologue of the AP-2 transcription factor. Oncogene 17:1911-1922 (1998).
-   6. Bast R C, et al. Reactivity of a monoclonal antibody with human ovarian carcinoma. J Clin Invest 68:1331-1337 (1981).
-   7. Bast R C, et al. A radioimmunoassay using a monoclonal antibody to monitor the course of epithelial ovarian cancer. N Engl J Med 309:883-887 (1983).
-   8. Berek J S, et al. Serum interleukin-6 levels correlate with disease status in patients with epithelial ovarian cancer. Am J Obstet Gynecol 164:1038-1043 (1991).
-   9. Bittner M, et al. Molecular Classification of Cutaneous Malignant Melanoma by Gene Expression Profiling. Nature 406:536-540 (2000).
-   10. Blake C, et al. UCI repository of machine learning databases (1998).
-   11. Boyd J, et al. Molecular genetic and clinical implications [Review]. Gynecol Oncol 64:196-206 (1997).
-   12. Breiman L, et al. Classification and regression trees, Wadsworth and Brooks (1984).
-   13. Buettner R, et al. An alternatively spliced form of AP-2 encodes a negative regulator of transcriptional activation by AP-2. Mol Cell Biol 13:4174-4185 (1993).
-   14. Chiao P J, et al. Elevated expression of the human ribosomal S2 gene in human tumors. Molecular Carcinogenesis 5:219-231 (1992).
-   15. Clark P, et al. The CN2 induction algorithm. Machine Learning 3:261-283 (1989).
-   16. Coleman M P, et al. Trends in cancer incidence and mortality. Lyon, France: IARC Scientific Publications 121:477-498 (1993).
-   17. Deyo J, et al. A novel protein expressed at high cell density but not during growth arrest. DNA and Cell Biol 17:437-447 (1998).
-   18. Draghici S. The Constraint Based Decomposition, accepted for publication in Neural Networks, to appear (2001).
-   19. Einhorn N, et al. Prospective evaluation of serum CA 125 levels for early detection of ovarian cancer. Obstet Gynecol 80:14-18 (1992).
-   20. Golub T R, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531-537 (1999).
-   21. Gotlieb W H, et al. Presence of interleukins in the ascites of patients with ovarian and other intrabdominal cancers. Cytokine 4:385-390 (1992).
-   22. Greenlee R T, et al. Cancer Statistics. CA Cancer J Clin 50:7-33 (2000).
-   23. Heath S, et al. Induction of oblique decision tree. In IJCAI-93. Washington, D.C. (1993).
-   24. Hogdall E V, et al. Predictive values of serum tumour markers tetranectin, OVX1, CASA and CA125 in patients with a pelvic mass. Int J Cancer 89:519-523 (2000).
-   25. Holschneider C H, et al. Ovarian cancer: epidemiology, biology, and prognostic factors. Semin Surg Oncol 1:3-10 (2000).
-   26. Houts T M. Improved 2-Color Normalization For Microarray Analyses Employing Cyanine Dyes, CAMDA (2000). Critical Assessment of Techniques for Microarray Data Mining. Duke University Medical Center, Dec. 18-19 (2000).
-   27. Jacobs I J, et al. Potential screening tests for ovarian cancer, in Sharp F, Mason W P, Leake R E (eds). Ovarian Cancer. London, Chapman and Hall Medical, 197-205 (1997).
-   28. Jacobs I, et al. Multimodal approach to screening for ovarian cancer. Lancet 1:268-271 (1988).
-   29. Jacobs I, et al. The CA 125 tumor-associated antigen: a review of the literature. Hum Reprod 4:1-12 (1989).
-   30. Kacinski B M, et al. Macrophage colony-stimulating factor is produced by human ovarian and endometrial adenocarcinoma-derived cell lines and is present at abnormally high levels in the plasma of ovarian carcinoma patients with active disease. Cancer Cells 7:333-337 (1989).
-   31. Kerr, Martin, Churchill. Analysis of variance for gene expression microarray data. Journal of Computational Biology (2000).
-   32. Kim S Y, et al. Coordinate Control of Growth and Cytokeratin 13 Expression by Retinoic Acid. Molecular Carcinogenesis 16:6-11 (1996).
-   33. Kohonen T. Learning vector quantization. Neural Networks, 1 (suppl. 1):303 (1988).
-   34. Kohonen T. Learning vector quantization. In The Handbook of Brain Theory and Neural Networks, pp. 537-540. Cambridge, Mass.: MIT Press (1995).
-   35. MacBeath G, et al. Printing proteins as microarrays for high-throughput function determination. Science 289:1760-3 (2000).
-   36. Murthy K. On growing better decision trees from data. Unpublished doctoral dissertation. Johns Hopkins University (1995).
-   37. Musavi M, et al. On the training of radial basis functions classifiers. Neural Networks 5:595-603 (1992).
-   38. Patsner B, et al. Comparison of serum CA 125 and lipid associated sialic acid (LASA-P) in monitoring patients with invasive ovarian adenocarcinoma. Gynecol Oncol 30(1):98-103 (1988).
-   39. Peng Y S, et al. ARHI is the center of allelic deletion on chromosome 1p31 in ovarian and breast cancers. Int J Cancer 86:690-4 (2000).
-   40. Precup D, et al. Classification using Φ-machines and constructive function approximation. In Proc. 15th International Conf. on Machine Learning, pages 439-444.
Morgan Kaufmann, San    Francisco, Calif. (1998).-   41. Poggio T, et al. Networks for approximation and learning.    Proceedings of IEEE 78(9):1481-149 (1990).-   42. Quinlan J R: C4.5: Programs for machine learning,    Morgan-Kaufmann (1993).-   43. Rumelhart, D E, et al. Learning internal representations by    error backpropagation. Parallel Distributed Processing: Explorations    in the Microstructures of Cognition, MIT Press/Bradford Books    (1986).-   44. Schwartz P E, et al. Circulating tumor markers in the monitoring    of gynecologic malignancies. Cancer 60:353-361 (1987).-   45. Schmittgen T D et al. Quantitative reverse    transcription-polymerase chain reaction to study mRNA decay:    comparison of endpoint and real-time methods. Anal Biochem,    285:194-204 (2000).-   46. Sonoda K, Nakashima M, Kaku T, Kamura T, Nakano H, Watanabe T. A    novel tumor-associated antigen expressed in human uterine and    ovarian carcinomas. Cancer 1996 77:1501-9,-   47. Nakashima M, Sonoda K, Watanabe T. Inhibition of cell growth and    induction of apoptotic cell death by the human tumor-associated    antigen RCAS1. Nat. Med. 1999 5:938-42.-   48. Lindstrom M S, Klangby U, Wiman K G. p14ARF homozygous deletion    or MDM2 overexpression in Burkitt lymphoma lines carrying wild type    p53. Oncogene. 20(17):2171-7, 2001.

What is claimed is:
 1. A diagnostic device for use in detecting the presence of head and neck squamous cell carcinoma (HNSCC) in a patient comprising an immunoassay incorporating at least one polypeptide marker antigen for HNSCC, for detecting a presence of at least one antibody in a patient's serum indicative of HNSCC and reactive with at least one of the following polypeptide marker antigens for HNSCC, each of said polypeptide marker antigens comprising a SEQ ID NO. selected from SEQ ID NO: 3, SEQ ID NO: 7, SEQ ID NO: 11, SEQ ID NO: 15, SEQ ID NO: 19, SEQ ID NO: 23, SEQ ID NO: 27, SEQ ID NO: 31, SEQ ID NO: 35, SEQ ID NO: 45, SEQ ID NO: 59, SEQ ID NO: 69, SEQ ID NO: 79, SEQ ID NO: 98, SEQ ID NO: 102, SEQ ID NO: 112, SEQ ID NO: 122, SEQ ID NO: 136, SEQ ID NO: 146, SEQ ID NO: 156, SEQ ID NO: 164, SEQ ID NO: 168, SEQ ID NO: 178, SEQ ID NO: 188, SEQ ID NO: 198, SEQ ID NO: 208, SEQ ID NO: 218, SEQ ID NO: 228, SEQ ID NO: 235, SEQ ID NO: 245, SEQ ID NO: 255, SEQ ID NO: 265, SEQ ID NO: 275, SEQ ID NO: 292, SEQ ID NO: 305, SEQ ID NO: 327, SEQ ID NO: 337, SEQ ID NO: 347, SEQ ID NO: 351, SEQ ID NO: 361, SEQ ID NO: 375, SEQ ID NO: 385, SEQ ID NO: 389, SEQ ID NO: 399, SEQ ID NO: 422, SEQ ID NO: 432, SEQ ID NO: 442, SEQ ID NO: 452, SEQ ID NO: 456, SEQ ID NO: 460, SEQ ID NO: 473, SEQ ID NO: 483, SEQ ID NO: 490, SEQ ID NO: 497, SEQ ID NO: 507, SEQ ID NO: 517, SEQ ID NO: 527, SEQ ID NO: 537, SEQ ID NO: 547, SEQ ID NO: 554, SEQ ID NO: 564, SEQ ID NO: 574, SEQ ID NO: 578, SEQ ID NO: 600, SEQ ID NO: 607, SEQ ID NO: 614, SEQ ID NO: 625, SEQ ID NO: 632, SEQ ID NO: 642, SEQ ID NO: 646, SEQ ID NO: 653, SEQ ID NO: 663, SEQ ID NO: 670, SEQ ID NO: 683, SEQ ID NO: 687, SEQ ID NO: 698, SEQ ID NO: 705, SEQ ID NO: 712, SEQ ID NO: 719, SEQ ID NO: 743, SEQ ID NO: 753, SEQ ID NO: 763, SEQ ID NO: 770, SEQ ID NO: 781, SEQ ID NO: 785, SEQ ID NO: 795, SEQ ID NO: 802, SEQ ID NO: 812, SEQ ID NO: 822, SEQ ID NO: 843, SEQ ID NO: 852, SEQ ID NO: 859, SEQ ID NO: 869, SEQ ID NO: 882, SEQ ID NO: 892, SEQ ID NO: 896, SEQ ID NO: 909, SEQ ID NO: 919, SEQ ID NO: 938, SEQ ID NO: 948, SEQ ID NO: 966, SEQ ID NO: 976, SEQ ID NO: 986, SEQ ID NO: 996, SEQ ID NO: 1016, SEQ ID NO: 1029, SEQ ID NO: 1039, SEQ ID NO: 1072, SEQ ID NO: 1080, SEQ ID NO: 1093, SEQ ID NO: 1103, or SEQ ID NO: 1119, or consisting of a SEQ ID NO. selected from SEQ ID NO: 55, SEQ ID NO: 371, SEQ ID NO: 621, SEQ ID NO: 694, SEQ ID NO: 962, SEQ ID NO: 777, SEQ ID NO: 1129, or SEQ ID NO: 1076; each of said polypeptides being expressed as an insert in frame with the T7 10B phage capsid gene product as a fusion polypeptide.
 2. The diagnostic device of claim 1, wherein said immunoassay is selected from the group consisting of a microarray immunoassay, a macroarray immunoassay, a bead array immunoassay, an ELISA, a slide immunoassay, and a filter immunoassay.
 3. A diagnostic device for use in staging head and neck squamous cell carcinoma (HNSCC) in a patient comprising an immunoassay incorporating at least one polypeptide marker antigen for HNSCC, for detecting a presence of at least one antibody in a patient's serum indicative of HNSCC and reactive with at least one of the following peptide markers for HNSCC, each of said polypeptide marker antigens comprising a SEQ ID NO. selected from SEQ ID NO: 3, SEQ ID NO: 7, SEQ ID NO: 11, SEQ ID NO: 15, SEQ ID NO: 19, SEQ ID NO: 23, SEQ ID NO: 27, SEQ ID NO: 31, SEQ ID NO: 35, SEQ ID NO: 45, SEQ ID NO: 59, SEQ ID NO: 69, SEQ ID NO: 79, SEQ ID NO: 98, SEQ ID NO: 102, SEQ ID NO: 112, SEQ ID NO: 122, SEQ ID NO: 136, SEQ ID NO: 146, SEQ ID NO: 156, SEQ ID NO: 164, SEQ ID NO: 168, SEQ ID NO: 178, SEQ ID NO: 188, SEQ ID NO: 198, SEQ ID NO: 208, SEQ ID NO: 218, SEQ ID NO: 228, SEQ ID NO: 235, SEQ ID NO: 245, SEQ ID NO: 255, SEQ ID NO: 265, SEQ ID NO: 275, SEQ ID NO: 292, SEQ ID NO: 305, SEQ ID NO: 327, SEQ ID NO: 337, SEQ ID NO: 347, SEQ ID NO: 351, SEQ ID NO: 361, SEQ ID NO: 375, SEQ ID NO: 385, SEQ ID NO: 389, SEQ ID NO: 399, SEQ ID NO: 422, SEQ ID NO: 432, SEQ ID NO: 442, SEQ ID NO: 452, SEQ ID NO: 456, SEQ ID NO: 460, SEQ ID NO: 473, SEQ ID NO: 483, SEQ ID NO: 490, SEQ ID NO: 497, SEQ ID NO: 507, SEQ ID NO: 517, SEQ ID NO: 527, SEQ ID NO: 537, SEQ ID NO: 547, SEQ ID NO: 554, SEQ ID NO: 564, SEQ ID NO: 574, SEQ ID NO: 578, SEQ ID NO: 600, SEQ ID NO: 607, SEQ ID NO: 614, SEQ ID NO: 625, SEQ ID NO: 632, SEQ ID NO: 642, SEQ ID NO: 646, SEQ ID NO: 653, SEQ ID NO: 663, SEQ ID NO: 670, SEQ ID NO: 683, SEQ ID NO: 687, SEQ ID NO: 698, SEQ ID NO: 705, SEQ ID NO: 712, SEQ ID NO: 719, SEQ ID NO: 743, SEQ ID NO: 753, SEQ ID NO: 763, SEQ ID NO: 770, SEQ ID NO: 781, SEQ ID NO: 785, SEQ ID NO: 795, SEQ ID NO: 802, SEQ ID NO: 812, SEQ ID NO: 822, SEQ ID NO: 843, SEQ ID NO: 852, SEQ ID NO: 859, SEQ ID NO: 869, SEQ ID NO: 882, SEQ ID NO: 892, SEQ ID NO: 896, SEQ ID NO: 909, SEQ ID NO: 919, SEQ ID NO: 938, SEQ ID NO: 948, SEQ ID NO: 966, SEQ ID NO: 976, SEQ ID NO: 986, SEQ ID NO: 996, SEQ ID NO: 1016, SEQ ID NO: 1029, SEQ ID NO: 1039, SEQ ID NO: 1072, SEQ ID NO: 1080, SEQ ID NO: 1093, SEQ ID NO: 1103, or SEQ ID NO: 1119, or consisting of a SEQ ID NO. selected from SEQ ID NO: 55, SEQ ID NO: 371, SEQ ID NO: 621, SEQ ID NO: 694, SEQ ID NO: 962, SEQ ID NO: 777, SEQ ID NO: 1129, or SEQ ID NO: 1076; each of said polypeptides being expressed as an insert in frame with the T7 10B phage capsid gene product as a fusion polypeptide.
 4. The diagnostic device of claim 3, wherein said immunoassay is selected from the group consisting of a microarray immunoassay, a macroarray immunoassay, a bead array immunoassay, an ELISA, a slide immunoassay, and a filter immunoassay.
 5. Isolated polypeptide marker antigens for head and neck squamous cell carcinoma, each of said polypeptide marker antigens comprising a SEQ ID NO. selected from SEQ ID NO: 3, SEQ ID NO: 7, SEQ ID NO: 11, SEQ ID NO: 15, SEQ ID NO: 19, SEQ ID NO: 23, SEQ ID NO: 27, SEQ ID NO: 31, SEQ ID NO: 35, SEQ ID NO: 45, SEQ ID NO: 59, SEQ ID NO: 69, SEQ ID NO: 79, SEQ ID NO: 98, SEQ ID NO: 102, SEQ ID NO: 112, SEQ ID NO: 122, SEQ ID NO: 136, SEQ ID NO: 146, SEQ ID NO: 156, SEQ ID NO: 164, SEQ ID NO: 168, SEQ ID NO: 178, SEQ ID NO: 188, SEQ ID NO: 198, SEQ ID NO: 208, SEQ ID NO: 218, SEQ ID NO: 228, SEQ ID NO: 235, SEQ ID NO: 245, SEQ ID NO: 255, SEQ ID NO: 265, SEQ ID NO: 275, SEQ ID NO: 292, SEQ ID NO: 305, SEQ ID NO: 327, SEQ ID NO: 337, SEQ ID NO: 347, SEQ ID NO: 351, SEQ ID NO: 361, SEQ ID NO: 375, SEQ ID NO: 385, SEQ ID NO: 389, SEQ ID NO: 399, SEQ ID NO: 422, SEQ ID NO: 432, SEQ ID NO: 442, SEQ ID NO: 452, SEQ ID NO: 456, SEQ ID NO: 460, SEQ ID NO: 473, SEQ ID NO: 483, SEQ ID NO: 490, SEQ ID NO: 497, SEQ ID NO: 507, SEQ ID NO: 517, SEQ ID NO: 527, SEQ ID NO: 537, SEQ ID NO: 547, SEQ ID NO: 554, SEQ ID NO: 564, SEQ ID NO: 574, SEQ ID NO: 578, SEQ ID NO: 600, SEQ ID NO: 607, SEQ ID NO: 614, SEQ ID NO: 625, SEQ ID NO: 632, SEQ ID NO: 642, SEQ ID NO: 646, SEQ ID NO: 653, SEQ ID NO: 663, SEQ ID NO: 670, SEQ ID NO: 683, SEQ ID NO: 687, SEQ ID NO: 698, SEQ ID NO: 705, SEQ ID NO: 712, SEQ ID NO: 719, SEQ ID NO: 743, SEQ ID NO: 753, SEQ ID NO: 763, SEQ ID NO: 770, SEQ ID NO: 781, SEQ ID NO: 785, SEQ ID NO: 795, SEQ ID NO: 802, SEQ ID NO: 812, SEQ ID NO: 822, SEQ ID NO: 843, SEQ ID NO: 852, SEQ ID NO: 859, SEQ ID NO: 869, SEQ ID NO: 882, SEQ ID NO: 892, SEQ ID NO: 896, SEQ ID NO: 909, SEQ ID NO: 919, SEQ ID NO: 938, SEQ ID NO: 948, SEQ ID NO: 966, SEQ ID NO: 976, SEQ ID NO: 986, SEQ ID NO: 996, SEQ ID NO: 1016, SEQ ID NO: 1029, SEQ ID NO: 1039, SEQ ID NO: 1072, SEQ ID NO: 1080, SEQ ID NO: 1093, SEQ ID NO: 1103, or SEQ ID NO: 1119, or consisting of a SEQ ID NO. selected from SEQ ID NO: 55, SEQ ID NO: 371, SEQ ID NO: 621, SEQ ID NO: 694, SEQ ID NO: 962, SEQ ID NO: 777, SEQ ID NO: 1129, or SEQ ID NO: 1076; each of said polypeptides being expressed as an insert in frame with the T7 10B phage capsid gene product as a fusion polypeptide.
 6. A method of diagnosing head and neck squamous cell carcinoma (HNSCC), including the steps of: detecting an antibody reactive with a polypeptide marker antigen of claim 1 in the serum of a patient indicative of the presence of HNSCC with the diagnostic device of claim 1; and diagnosing the patient with HNSCC.
 7. The method of claim 6, wherein said polypeptide marker antigens are chosen from the group consisting of SEQ ID NOs: 3, 7, 11, 15, 19, 23, 27 and 31, wherein each of said polypeptides are expressed as an insert in frame with the T7 10B phage capsid gene product as a fusion polypeptide.
 8. A method of staging head and neck squamous cell carcinoma (HNSCC), including the steps of: detecting an antibody reactive with a polypeptide marker antigen of claim 3 in the serum of a patient indicative of a stage of HNSCC with the diagnostic device of claim 3; and determining the stage of HNSCC.
 9. The method of claim 8, wherein said polypeptide marker antigens are chosen from the group consisting of SEQ ID NOs: 3, 7, 11, 15, 19, 23, 27 and 31, wherein each of said polypeptides are expressed as an insert in frame with the T7 10B phage capsid gene product as a fusion polypeptide.